Abstract


We investigate connections between SAT (the propositional satisfiability 
problem) and combinatorics, around the minimum degree of variables in vari- 
ous forms of redundancy-free boolean conjunctive normal forms (clause-sets). 

Let $\mu\mathrm{vd}(F) \in \mathbb{N}$ for a clause-set $F$ denote the minimum variable-degree,
the minimum of the number of occurrences of a variable. A central result is
the upper bound $\sigma(F) + 1 \le \mu\mathrm{vd}(F) \le \mathrm{nM}(\sigma(F)) \le \sigma(F) + 1 + \log_2(\sigma(F))$
for lean clause-sets $F \in \mathcal{LEAN}$ in dependency on the surplus $\sigma(F) \in \mathbb{N}$. Lean
clause-sets, defined as having no non-trivial autarkies (partial assignments
satisfying some clauses and not touching the other clauses), generalise
minimally unsatisfiable clause-sets, i.e., $\mathcal{LEAN} \supset \mathcal{MU}$. For the surplus we have
$\sigma(F) \le \delta(F) = c(F) - n(F)$, using the deficiency $\delta(F)$ of clause-sets, the
difference between the number $c(F)$ of clauses and the number $n(F)$ of variables.
$\mathrm{nM}(k) \in \mathbb{N}$ is the $k$-th “non-Mersenne” number, skipping in the sequence of
natural numbers all numbers of the form $2^n - 1$. As an application of the upper
bound we obtain that clause-sets $F$ violating $\mu\mathrm{vd}(F) \le \mathrm{nM}(\sigma(F))$ must have
a non-trivial autarky, so clauses can be removed satisfiability-equivalently. We
obtain a polynomial-time autarky reduction, although it is open whether such
an autarky itself can be found in polynomial time.

We show that the upper bound is sharp, i.e., $\mu\mathrm{vd}(\mathcal{LEAN}_{\delta=k}) = \mathrm{nM}(k)$
for all deficiencies $k \in \mathbb{N}$, where $\mu\mathrm{vd}(\mathcal{LEAN}_{\delta=k})$ is the maximum of $\mu\mathrm{vd}(F)$
over $F \in \mathcal{LEAN}_{\delta=k}$. The determination of $\mu\mathrm{vd}(\mathcal{MU}_{\delta=k}) =: \mu\mathrm{nM}(k)$ seems
to be a much more involved question. We show that for $k \le 5$ we have
$\mu\mathrm{nM}(k) = \mathrm{nM}(k)$, but for $k = 6$ we have $\mu\mathrm{nM}(k) = \mathrm{nM}(k) - 1$. Moreover this
correction by $-1$ causes further corrections by $-1$ for infinitely many other
deficiencies, resulting in the upper-bound function $\mathrm{nM}_1 : \mathbb{N} \to \mathbb{N}$, an instance
of a generalised non-Mersenne function found by a novel recursion scheme.

Extensive introductions, overviews, conclusions, examples and open prob- 
lems are provided. 
1 Introduction


In this work we aim at bringing together some aspects of combinatorics with the 
developing theory of SAT. We concentrate on degree considerations in “clause-sets” 
(conjunctive normal forms as set-systems), which can be considered as generalised 
hypergraphs, namely hypergraphs with “polarities”. The general goal is to develop 
an understanding of propositional (un)satisfiability, which corresponds for hyper- 
graphs to an understanding of (non-)2-colourability. 

SAT, the prototypical NP-complete problem ([?]), took a strong development
in the past two decades also regarding (industrial) applications (see the handbook
[?] for a recent overview). It is often mainly considered as belonging to complexity
theory, algorithms and heuristics (with [?] the basic papers here), and
finally implementations and experimentation (“SAT solvers”). “Understanding”
SAT in a precise sense is considered to be impossible, and only various investigations
on random and approximation structures (including “islands of tractability”)
in general are deemed fruitful. We want to challenge this view, starting to build
a new bridge, towards an understanding of unsatisfiability. We note here that
understanding unsatisfiability seems easier than understanding satisfiability, since
unsatisfiability means a form of completion, all assignments have been excluded as 
potential satisfying assignments (“models”), while satisfiability means a lack of such 
completion. More precisely, we aim at understanding minimal unsatisfiability, the 
building blocks of unsatisfiability — similar to critical colourability, here removal 
of any clause renders the clause-set satisfiable. 

A fundamental question, the subject of this study, is the existence of “simple” 
variables in clause-sets. “Simple” here means a variable occurring not very often. 
A major use of the existence of such variables is in inductive proofs of properties 
of minimally unsatisfiable clause-sets, using splitting on a variable to reduce $n$, the
number of variables, to $n - 1$: here it is vital that we have control over the changes
imposed by the substitution, and so we want to split on a variable occurring as few
times as possible. “Splitting” of a clause-set $F$ on variable $v$ means the consideration
of the clause-sets $\langle v \to 0 \rangle * F$, $\langle v \to 1 \rangle * F$, that is, instantiating variable $v$ by both
truth values 0, 1. A feature of clause-sets is the closure under splitting, and splitting
is a major tool for investigations into minimal unsatisfiability.


1.1 Deficiency as the main structural parameter 


The definition of the class $\mathcal{CLS}$ of “clause-sets”, and of the class $\mathcal{MU} \subset \mathcal{CLS}$ of
“minimally unsatisfiable clause-sets”, can be quickly (and precisely) given as follows,
using (just) natural numbers as “variables”:

A “literal” $x$ is an element of $\mathbb{Z} \setminus \{0\}$. A “clause” $C$ is a finite set of literals,
such that there is no $x \in C$ with $-x \in C$. Using $-L := \{-x : x \in L\}$ for sets $L$ of
literals, the “clash-freeness” condition for $C$ becomes $C \cap -C = \emptyset$. A “clause-set”
$F$ is a finite set of clauses; the set of all clause-sets is denoted by $\mathcal{CLS}$. The most
basic measurements for $F \in \mathcal{CLS}$ are:


• the number $c(F) := |F| \in \mathbb{N}_0$ of clauses of $F$;

• the number $n(F) := |\operatorname{var}(F)| \in \mathbb{N}_0$ of variables of $F$, where $\operatorname{var}(F)$ is the set
  of $v \in \mathbb{N}$ (variables as positive integers) with $\{v, -v\} \cap \bigcup F \ne \emptyset$;

• the “deficiency” $\delta(F) := c(F) - n(F) \in \mathbb{Z}$. This parameter is only informative
  when certain (weak) assumptions are made for $F$, and for general $F$ the
  “maximal deficiency” $\delta^*(F) := \max_{F' \subseteq F} \delta(F') \in \mathbb{N}_0$ is to be used.


A clause-set $F$ is “satisfiable” if there exists a partial assignment $\varphi$, which here in
this introduction is just a clause, such that $\varphi \cap D \ne \emptyset$ for all $D \in F$.¹⁾ The set of
all satisfiable clause-sets is $\mathcal{SAT} \subset \mathcal{CLS}$, the set of all unsatisfiable clause-sets is
$\mathcal{USAT} := \mathcal{CLS} \setminus \mathcal{SAT}$. Finally $\mathcal{MU} \subset \mathcal{USAT}$ is the set of $F \in \mathcal{USAT}$ such that
for all $C \in F$ we have $F \setminus \{C\} \in \mathcal{SAT}$.
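The definitions above translate almost literally into code. The following is a minimal sketch in Python, assuming clause-sets are represented as frozensets of frozensets of non-zero integers; all function names are illustrative and not taken from the report, and the satisfiability test is plain brute force over total assignments, so it is only meant for tiny examples:

```python
from itertools import product

def clause_set(*clauses):
    """Build a clause-set (hypothetical helper): a frozenset of frozensets of literals."""
    return frozenset(frozenset(C) for C in clauses)

def var(F):
    """Variables of F: the positive integers v such that v or -v occurs in F."""
    return {abs(x) for C in F for x in C}

def c(F):
    """Number of clauses."""
    return len(F)

def n(F):
    """Number of variables."""
    return len(var(F))

def deficiency(F):
    """delta(F) = c(F) - n(F)."""
    return c(F) - n(F)

def is_satisfiable(F):
    """F is satisfiable iff some clash-free set of literals meets every clause.
    Brute force: total assignments over var(F) suffice."""
    vs = sorted(var(F))
    for signs in product((1, -1), repeat=len(vs)):
        phi = frozenset(s * v for s, v in zip(signs, vs))
        if all(phi & C for C in F):
            return True
    return False

def is_minimally_unsatisfiable(F):
    """F is unsatisfiable, and removing any single clause makes it satisfiable."""
    return not is_satisfiable(F) and all(is_satisfiable(F - {C}) for C in F)

# All four clauses over the variables 1, 2.
A2 = clause_set({1, 2}, {-1, 2}, {1, -2}, {-1, -2})
print(c(A2), n(A2), deficiency(A2), is_minimally_unsatisfiable(A2))
```

The empty clause-set is satisfiable in this encoding, and the clause-set containing only the empty clause is unsatisfiable, in line with the definition via clauses intersecting every clause of $F$.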

The background for the investigations of this report is the enterprise of classifying
$F \in \mathcal{MU}$ in dependency on $\delta(F)$. The basic facts are $\delta^*(F) = \delta(F)$ (as will
be discussed in Subsection 1.9), and the well-known $\delta(F) \ge 1$, as first shown in [?].
For $\delta(F) = 1$ the structure is completely known ([?]; see Example B.2), for
$\delta(F) = 2$ the structure after reduction of singular variables (occurring in one sign
only once) is known ([42]; see Example B.3), while for $\delta(F) \in \{3, 4\}$ only basic cases
have been classified ([99]).

1) The clause $\varphi$ is the set of satisfied literals of the corresponding “partial assignment”. This
definition of “satisfying assignments”, via clauses intersecting every clause of $F$, generalises
transversals of hypergraphs, by taking complementation into account ($\varphi$ does not contain clashes).

The starting point of our investigation is Lemma C.2 in [?], where it is shown
that $F \in \mathcal{MU}$ with $n(F) > 0$ must have a variable $v \in \operatorname{var}(F)$ with at most $\delta(F)$
positive and at most $\delta(F)$ negative occurrences; we write this as $\mathrm{ld}_F(v) \le \delta(F)$ and
$\mathrm{ld}_F(-v) \le \delta(F)$, using the notion of literal degrees (the number of occurrences of
the literal), where for a literal $x$ its degree is

$$\mathrm{ld}_F(x) := |\{C \in F : x \in C\}| \in \mathbb{N}_0.$$

Thus we have $\mathrm{vd}_F(v) \le 2\delta(F)$, using the variable degree

$$\mathrm{vd}_F(v) := \mathrm{ld}_F(v) + \mathrm{ld}_F(-v) \in \mathbb{N}_0.$$

Using the minimum variable degree (min-var-degree)

$$\mu\mathrm{vd}(F) := \min_{v \in \operatorname{var}(F)} \mathrm{vd}_F(v) \in \mathbb{N}$$

of $F$ with $n(F) > 0$, the upper bound becomes $\mu\mathrm{vd}(F) \le 2\delta(F)$. A main theme
of this report is the consideration of $\mu\mathrm{vd}(\mathcal{MU}_{\delta=k}) \in \mathbb{N}$ for $k \in \mathbb{N}$, the maximum
of $\mu\mathrm{vd}(F)$ for $F \in \mathcal{MU}$ with $\delta(F) = k$. The upper bound now becomes
$\mu\mathrm{vd}(\mathcal{MU}_{\delta=k}) \le 2k$.
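The degree notions are simple counting functions. A small sketch, again assuming clause-sets as frozensets of frozensets of non-zero integers (the function names are illustrative, not the report's), which also checks the bound $\mu\mathrm{vd}(F) \le 2\delta(F)$ on two small minimally unsatisfiable examples:

```python
def clause_set(*clauses):
    """Hypothetical helper: a frozenset of frozensets of literals."""
    return frozenset(frozenset(C) for C in clauses)

def literal_degree(F, x):
    """ld_F(x): number of clauses of F containing the literal x."""
    return sum(1 for C in F if x in C)

def variable_degree(F, v):
    """vd_F(v) = ld_F(v) + ld_F(-v)."""
    return literal_degree(F, v) + literal_degree(F, -v)

def min_var_degree(F):
    """mu-vd(F): minimum of vd_F(v) over the variables of F (requires n(F) > 0)."""
    variables = {abs(x) for C in F for x in C}
    if not variables:
        raise ValueError("min-var-degree is only defined for n(F) > 0")
    return min(variable_degree(F, v) for v in variables)

def deficiency(F):
    return len(F) - len({abs(x) for C in F for x in C})

F1 = clause_set({1, 2}, {-1}, {-2})                  # deficiency 1
F2 = clause_set({1, 2}, {-1, 2}, {1, -2}, {-1, -2})  # deficiency 2
for F in (F1, F2):
    print(deficiency(F), min_var_degree(F), min_var_degree(F) <= 2 * deficiency(F))
```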

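We show a sharper bound on $\mu\mathrm{vd}(F)$, namely we show that the worst-cases
$\mathrm{ld}_F(v), \mathrm{ld}_F(-v) \le \delta(F)$ cannot occur at the same time (for a suitable variable),
but actually $\mathrm{ld}_F(v) + \mathrm{ld}_F(-v) - \delta(F)$ only grows logarithmically in $\delta(F)$. The really
interesting aspect here is the precise determination of $\mu\mathrm{vd}(\mathcal{MU}_{\delta=k})$, and we investigate
the (elementary) number-theoretic function $\mathrm{nM}(k)$, which yields the upper
bound $\mu\mathrm{vd}(\mathcal{MU}_{\delta=k}) \le \mathrm{nM}(k)$ for all $k \in \mathbb{N}$, where the function $\mathrm{nM} : \mathbb{N} \to \mathbb{N}$ fulfils
$k + \lfloor \log_2(k + 1) \rfloor \le \mathrm{nM}(k) \le k + 1 + \lfloor \log_2(k) \rfloor$ for $k \in \mathbb{N}$.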
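The function $\mathrm{nM}$ can be tabulated directly from its description as the enumeration of the natural numbers that are not of the form $2^n - 1$. The following sketch (function names are illustrative) computes $\mathrm{nM}(k)$ by simple counting and checks the two logarithmic bounds stated above for a range of $k$:

```python
def is_mersenne(m):
    """True iff m = 2**n - 1 for some n >= 1, i.e. m is 1, 3, 7, 15, ..."""
    return m >= 1 and (m & (m + 1)) == 0

def nM(k):
    """k-th natural number (k >= 1) that is not of the form 2**n - 1."""
    if k < 1:
        raise ValueError("nM is defined for k >= 1")
    m = count = 0
    while count < k:
        m += 1
        if not is_mersenne(m):
            count += 1
    return m

def floor_log2(x):
    """floor(log_2(x)) for an integer x >= 1."""
    return x.bit_length() - 1

print([nM(k) for k in range(1, 13)])
for k in range(1, 500):
    assert k + floor_log2(k + 1) <= nM(k) <= k + 1 + floor_log2(k)
```

The linear search is deliberately naive; it is meant to mirror the definition rather than to be efficient.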


1.2 Refining deficiency by surplus 


After having settled this basic min-var-degree upper bound for $\mathcal{MU}_{\delta=k}$, we show a
sharper bound on $\mu\mathrm{vd}(F)$ for a larger class of clause-sets $F$:

• The larger class of clause-sets considered is the class $\mathcal{LEAN}$ of lean clause-sets
  (introduced in [?]), which are clause-sets having no non-trivial autarky. For
  an overview on the theory of minimally unsatisfiable clause-sets and on the
  theory of autarkies see [43]. $\mathcal{LEAN} \subset \mathcal{CLS}$ is the set of $F \in \mathcal{CLS}$ such that
  there is no partial assignment $\varphi$ (a “non-trivial autarky”) with the properties

  – for every clause $D \in F$ with $-\varphi \cap D \ne \emptyset$ we have $\varphi \cap D \ne \emptyset$ (note that
    this generalises the satisfaction criterion);

  – there exists $v \in \operatorname{var}(F)$ with $\{v, -v\} \cap \varphi \ne \emptyset$.

  Note $\mathcal{LEAN} \cap \mathcal{SAT} = \{\top\}$, where $\top := \emptyset \in \mathcal{CLS}$ is the empty clause-set (the
  standard satisfiable clause-set).


• The deficiency $\delta(F) \in \mathbb{Z}$ is strengthened by the surplus $\sigma(F) \in \mathbb{Z}$, defined in
  case of $n(F) > 0$ as follows.

  Consider the bipartite clause-variable graph of $F$ (generalising the incidence
  graph of a hypergraph), with the clauses $C \in F$ on one side of the bipartition,
  and the variables $v \in \operatorname{var}(F)$ on the other side, and an edge between
  $v$ and $C$ if $\{v, -v\} \cap C \ne \emptyset$. The “expansion” of a set $\emptyset \ne V \subseteq \operatorname{var}(F)$
  of variables is $|\Gamma(V)| - |V|$, where $\Gamma(V)$ is the set of neighbours of $V$ (incident
  clauses), and the surplus then is the minimum expansion, i.e.,
  $\sigma(F) = \min_{\emptyset \ne V \subseteq \operatorname{var}(F)} (|\Gamma(V)| - |V|)$.

  In the terminology of [73, Section 1.3], $\delta^*(F)$ is the deficiency of the bipartite
  clause-variable graph (with bipartition $(F, \operatorname{var}(F))$), while $\sigma(F)$ is the surplus
  of the bipartite variable-clause graph (with bipartition $(\operatorname{var}(F), F)$).

  Note that by considering $V = \operatorname{var}(F)$ we have $\sigma(F) \le \delta(F)$, and by considering
  $V = \{v\}$ for $v \in \operatorname{var}(F)$ we get $\sigma(F) \le \mu\mathrm{vd}(F) - 1$.

  We have $\sigma(F) \ge 1$ for $F \in \mathcal{LEAN}$ with $n(F) > 0$ ([61, Lemma 7.7]), generalising
  the basic fact $\delta(F) \ge 1$ for $F \in \mathcal{MU}$.
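The surplus, as the minimum expansion over all non-empty variable sets, can be sketched by direct enumeration. The code below assumes clause-sets as frozensets of frozensets of non-zero integers and uses illustrative names; it is exponential in $n(F)$ and is not the way one would compute the surplus in practice, but it follows the definition term by term and checks the two inequalities $\sigma(F) \le \delta(F)$ and $\sigma(F) \le \mu\mathrm{vd}(F) - 1$ on small examples:

```python
from itertools import combinations

def clause_set(*clauses):
    """Hypothetical helper: a frozenset of frozensets of literals."""
    return frozenset(frozenset(C) for C in clauses)

def variables(F):
    return sorted({abs(x) for C in F for x in C})

def expansion(F, V):
    """|Gamma(V)| - |V|, where Gamma(V) is the set of clauses containing a variable from V."""
    V = set(V)
    neighbours = sum(1 for C in F if any(abs(x) in V for x in C))
    return neighbours - len(V)

def surplus(F):
    """sigma(F): minimum expansion over all non-empty subsets V of var(F) (requires n(F) > 0)."""
    vs = variables(F)
    if not vs:
        raise ValueError("surplus is only defined here for n(F) > 0")
    return min(expansion(F, V)
               for r in range(1, len(vs) + 1)
               for V in combinations(vs, r))

def deficiency(F):
    return len(F) - len(variables(F))

def min_var_degree(F):
    return min(expansion(F, [v]) + 1 for v in variables(F))

F1 = clause_set({1, 2}, {-1}, {-2})
F2 = clause_set({1, 2}, {-1, 2}, {1, -2}, {-1, -2})
for F in (F1, F2):
    print(surplus(F), deficiency(F), min_var_degree(F))
```

Note that `min_var_degree` reuses `expansion`: for a single variable $v$ the set $\Gamma(\{v\})$ consists exactly of the clauses in which $v$ occurs, so $\mathrm{vd}_F(v) = |\Gamma(\{v\})|$.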


Now a central result of this report (Theorem 0.9) is

$$\mu\mathrm{vd}(F) \le \mathrm{nM}(\sigma(F))$$

for $F \in \mathcal{LEAN}$ with $n(F) > 0$. As an application we obtain (Theorem 10.[?])
that via removing satisfiability-equivalently some clauses (via some autarky), we
can reduce every (multi-)clause-set $F$ in polynomial time to a (multi-)clause-set $F'$
containing a variable occurring with degree at most $\sigma(F') + 1 + \log_2(\sigma(F'))$. It is
an open problem whether such an autarky can be found in polynomial time (for
arbitrary clause-sets $F$); we conjecture (Conjecture 10.3) that this is possible.

We also show sharpness of the upper bound, i.e., $\mu\mathrm{vd}(\mathcal{LEAN}_{\delta=k}) = \mathrm{nM}(k)$
for all $k \in \mathbb{N}$, in Corollary [?] (proving Conjecture 23 from the conference version
[62]), which indeed holds for every class of clause-sets between $\mathcal{VMU}$, i.e., “variable-
minimally unsatisfiable clause-sets” as introduced in [?], and $\mathcal{LEAN}$; the definition
of $\mathcal{VMU}$ is as the set of $F \in \mathcal{USAT}$ such that for all $F' \subseteq F$ with $\operatorname{var}(F') \subset \operatorname{var}(F)$
we have $F' \in \mathcal{SAT}$.

We then come back to the special case of minimal unsatisfiability. Here things
turn out to be much more complicated, and the numbers $\mu\mathrm{nM}(k) := \mu\mathrm{vd}(\mathcal{MU}_{\delta=k}) \in \mathbb{N}$
for $k \in \mathbb{N}$, the guaranteed minimum variable degrees for minimally unsatisfiable
clause-sets of deficiency $k$, seem to be very complicated (and very interesting)
quantities. We prove the sharpened bound $\mu\mathrm{nM}(k) \le \mathrm{nM}_1(k)$, which improves on $\mathrm{nM}(k)$
for infinitely many $k$.

According to the goal of bringing different communities together, we try to 
provide and explain much of the relevant background, so that this report is mostly 
self-contained, and the results cited from the literature can be treated as black- 
boxes. 


1.3 Some basic intuitions about the upper bound nM

As already mentioned, the function $\mathrm{nM} : \mathbb{N} \to \mathbb{N}$ is strictly increasing with range

$$\mathrm{nM}(\mathbb{N}) = \mathbb{N} \setminus \{2^n - 1 : n \in \mathbb{N}\} = \{\underline{2}, 4, 5, \underline{6}, 8, \dots, \underline{14}, 16, 17, \dots\}.$$

We show $\mu\mathrm{vd}(\mathcal{LEAN}_{\delta=k}) = \mathrm{nM}(k)$ for deficiencies $k \in \mathbb{N}$, that is, every lean clause-set
$F$ with $n(F) > 0$ contains a variable $v \in \operatorname{var}(F)$ with $\mathrm{vd}_F(v) \le \mathrm{nM}(\delta(F))$, and
for every deficiency $k \ge 1$ there are lean clause-sets $F$ with $\mu\mathrm{vd}(F) = \mathrm{nM}(\delta(F))$.

The underlined values $2, 6, 14, \dots$, which have the form $2^n - 2$ for $n \in \mathbb{N}$, are the
function values at the “jump positions” $1, 4, 11, \dots$, which are of the form $2^n - n - 1$
for $n \ge 2$ (where the function value changes by $+2$, while otherwise it changes
by $+1$ for an increment of the argument). This basic structure of $\mathrm{nM}$ can be
motivated by the following constructions of $F \in \mathcal{MU}$ with “high” min-var-degree;
indeed these considerations only concern the lower bounds, given by appropriate
constructions, while the arithmetic nature of $\mathrm{nM}(k)$ rests on different considerations,
but for the deficiencies considered here, lower and upper bounds are equal, and the
lower bounds are easier to understand here.

The basic clause-sets are the $A_n$ for $n \in \mathbb{N}_0$, which consist of all $2^n$ sets (clauses)
of numbers $\pm 1, \dots, \pm n$, using the natural numbers $1, \dots, n$ as variables. So $A_0 =
\{\bot\}$ (where $\bot := \emptyset$ is the empty clause), $A_1 = \{\{-1\}, \{1\}\}$,
$A_2 = \{\{1, 2\}, \{-1, 2\}, \{1, -2\}, \{-1, -2\}\}$ and so on. It is
easy to see that we have $A_n \in \mathcal{MU}$ with $n(A_n) = n$, $c(A_n) = 2^n = \mu\mathrm{vd}(A_n)$, and
$\delta(A_n) = 2^n - n$. We will see that the $A_n$ have the largest possible min-var-degree
$2^n$ for given deficiency $2^n - n$, and we also have $\mathrm{nM}(2^n - n) = 2^n$ for $n \in \mathbb{N}$. These
deficiencies $k = 2^n - n$ (numerical values are $1, 2, 5, 12, \dots$) are the positions directly
after the jump positions (excluding deficiency $k = 1$ as a special case).
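The stated measures of the clause-sets $A_n$ can be checked mechanically for small $n$. A minimal sketch, assuming clause-sets as frozensets of frozensets of non-zero integers and using illustrative function names; `nM` is computed by naive counting of the numbers not of the form $2^m - 1$:

```python
from itertools import product

def full_clause_set(n):
    """A_n: all 2**n clauses over variables 1..n in which every variable occurs."""
    return frozenset(
        frozenset(s * v for s, v in zip(signs, range(1, n + 1)))
        for signs in product((1, -1), repeat=n)
    )

def deficiency(F):
    return len(F) - len({abs(x) for C in F for x in C})

def min_var_degree(F):
    vs = {abs(x) for C in F for x in C}
    return min(sum(1 for C in F if v in C or -v in C) for v in vs)

def nM(k):
    """k-th natural number not of the form 2**m - 1 (naive enumeration)."""
    m = count = 0
    while count < k:
        m += 1
        if (m & (m + 1)) != 0:  # m is not of the form 2**m' - 1
            count += 1
    return m

for n in range(1, 5):
    A = full_clause_set(n)
    k = deficiency(A)
    print(n, len(A), k, min_var_degree(A), nM(k))
```

For $n = 0$ the construction yields the clause-set containing only the empty clause, matching $A_0$ above.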

How can we obtain from that more clause-sets in $\mathcal{MU}$ with high min-var-degree?
Consider $A_3$: we have e.g. $\{1, 2, 3\}, \{1, 2, -3\} \in A_3$; now logically these two clauses
are equivalent to $\{1, 2\}$ (i.e., we have the same satisfying assignments; technically,
a “strict full subsumption resolution” is performed), and we obtain
$A'_3 := (A_3 \setminus \{\{1, 2, 3\}, \{1, 2, -3\}\}) \cup \{\{1, 2\}\} \in \mathcal{MU}$. Performing this process in general, using
$\{1, \dots, n\}, \{1, \dots, n-1, -n\} \in A_n$, yields $A'_n \in \mathcal{MU}$ for $n \ge 2$, with $n(A'_n) = n$,
$c(A'_n) = 2^n - 1$, $\delta(A'_n) = 2^n - n - 1$, and $\mu\mathrm{vd}(A'_n) = 2^n - 2$ (the (single) variable
with minimum occurrences is $n$). These deficiencies are precisely the jump positions
$2^n - n - 1$, and accordingly we have $\mathrm{nM}(2^n - n - 1) = 2^n - 2$.


Performing the same trick again to $A'_3$, we might replace $\{-1, 2, 3\}, \{-1, -2, 3\} \in
A'_3$ by $\{-1, 3\}$, obtaining $A''_3 \in \mathcal{MU}$. Again for general $n \ge 3$ we get $A''_n \in \mathcal{MU}$,
$n(A''_n) = n$, $c(A''_n) = 2^n - 2$, $\delta(A''_n) = 2^n - n - 2$, and $\mu\mathrm{vd}(A''_n) = 2^n - 3$; note
here the crucial difference, that the min-var-degree has only been changed by $-1$.
The reason is that there are two variables now with minimum occurrences, namely
$n - 1, n$, where the degree of variable $n$ changed first by $-2$, then by $-1$, while for
variable $n - 1$ the degree first changed by $-1$, and then by $-2$ (and for the other
variables $1, \dots, n - 2$ we had degree changes by $-1, -1$).
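The two merging steps can be replayed in code for small $n$. The sketch below uses illustrative names and a brute-force test for minimal unsatisfiability; for general $n \ge 3$ the choice of the second pair of clauses (the clause $\{-1, 2, \dots, n\}$ and its copy with variable $n - 1$ negated) is an assumption made here, chosen so that for $n = 3$ it reproduces exactly the pair named in the text:

```python
from itertools import product

def full_clause_set(n):
    """A_n: all 2**n clauses over variables 1..n in which every variable occurs."""
    return frozenset(frozenset(s * v for s, v in zip(signs, range(1, n + 1)))
                     for signs in product((1, -1), repeat=n))

def merge(F, C, D):
    """Replace two clauses differing in the sign of exactly one literal by their common part."""
    C, D = frozenset(C), frozenset(D)
    diff = C ^ D
    assert C in F and D in F and len(diff) == 2 and sum(diff) == 0
    return (F - {C, D}) | {C & D}

def a_prime(n):
    """A'_n: merge {1,...,n} and {1,...,n-1,-n} in A_n."""
    pos = list(range(1, n + 1))
    return merge(full_clause_set(n), pos, pos[:-1] + [-n])

def a_double_prime(n):
    """A''_n for n >= 3 (assumed pair: {-1,2,...,n} and its flip in variable n-1)."""
    C = [-1] + list(range(2, n + 1))
    D = [-x if x == n - 1 else x for x in C]
    return merge(a_prime(n), C, D)

def min_var_degree(F):
    vs = {abs(x) for C in F for x in C}
    return min(sum(1 for C in F if v in C or -v in C) for v in vs)

def is_satisfiable(F):
    vs = sorted({abs(x) for C in F for x in C})
    return any(all(frozenset(s * v for s, v in zip(signs, vs)) & C for C in F)
               for signs in product((1, -1), repeat=len(vs)))

def is_mu(F):
    return not is_satisfiable(F) and all(is_satisfiable(F - {C}) for C in F)

for n in (3, 4):
    for F in (a_prime(n), a_double_prime(n)):
        print(n, len(F), min_var_degree(F), is_mu(F))
```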

Now one might imagine this process of strict full subsumption resolution continuing
until deficiency $2^{n-1} - (n - 1) + 1$, always with change of the min-var-degree
by $-1$, just before the deficiency of the previous $A_{n-1}$; this would yield the
function $\mathrm{nM}$. However the combinatorial reality is more complicated, and as we
prove in this report (Section 14), at least we cannot get until $2^{n-1} - (n - 1) + 1$
for $n \ge 4$ (in effect), that is, at these deficiencies $k = 6, 13, 28, \dots$ we have
$\mu\mathrm{vd}(\mathcal{MU}_{\delta=k}) \le \mathrm{nM}_1(k) = \mathrm{nM}(k) - 1$.²⁾


1.4 Related work on MU 


A general overview on minimally unsatisfiable clause-sets (also “minimal unsatisfiable
clause-sets/formulas”, or “MU”) is [?]; later developments are in [?]
(generalisations to non-boolean clause-sets) and in [?] (studying “singular
DP-reduction”, the elimination of variables which occur in one sign only once).

Two early papers on the complexity aspects are [?], [?], who introduced the
complexity class $D^P$ and showed that the decision “$F \in \mathcal{MU}$?” with input
$F \in \mathcal{CLS}$ is complete for this class. Another important early paper is [?], which
showed $\delta(F) \ge 1$ for $F \in \mathcal{MU}$, where the notion of “deficiency” was introduced by
[25]. Furthermore [?] showed polytime-decision of the sub-class $\mathcal{SMU}_{\delta=1} \subset \mathcal{MU}_{\delta=1}$
(called “strongly minimal unsatisfiable” there), where $\mathcal{SMU} \subset \mathcal{MU}$ is the set of
$F \in \mathcal{USAT}$ such that for all $C \in F$ and all $x \in \mathbb{Z} \setminus \{0\}$ with $\{x, -x\} \cap C = \emptyset$ holds
$(F \setminus \{C\}) \cup \{C \cup \{x\}\} \in \mathcal{SAT}$, that is, adding any literal to any clause renders the
clause-set satisfiable. We use the terminology “saturated minimally unsatisfiable”
introduced in [?], where the important connection to splitting was introduced, and
a simpler proof of $\delta(F) \ge 1$ for $F \in \mathcal{MU}$ was given. Just for this introduction we
handle “partial assignments” via clauses $\varphi$ (containing the satisfied literals; thus
$-\varphi$ is the set of falsified literals), so for a literal $x$ the partial assignment $\langle x \to 0 \rangle$
is given by $\{-x\}$, while $\langle x \to 1 \rangle$ is given by $\{x\}$. The application of $\varphi$ to $F \in \mathcal{CLS}$
is defined as

$$\varphi * F := \{D \setminus -\varphi : D \in F \wedge \varphi \cap D = \emptyset\} \in \mathcal{CLS},$$

that is, removing first the satisfied clauses from $F$, and then the falsified literals from
the remaining clauses. Now for $F \in \mathcal{CLS}$ holds $F \in \mathcal{SMU}$ iff for all $x \in \mathbb{Z} \setminus \{0\}$
holds $\langle x \to 1 \rangle * F \in \mathcal{MU}$ (the “only if”-direction was shown in [?], the “if”-direction
in [?]). Due to this property plus the property that every $F \in \mathcal{MU}$ can
be “saturated” by adding literals to clauses, the class $\mathcal{SMU}$ is an important helping
class for investigations into $\mathcal{MU}$ via the splitting method, splitting up $F \in \mathcal{MU}$
into $\langle v \to 0 \rangle * F$ and $\langle v \to 1 \rangle * F$ for selected variables $v$.
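The application $\varphi * F$ and the splitting on a variable are short set operations. A minimal sketch, assuming partial assignments are given as clash-free sets of satisfied literals as in this introduction, and clause-sets as frozensets of frozensets of non-zero integers (the names `apply` and `split` are illustrative):

```python
def clause_set(*clauses):
    """Hypothetical helper: a frozenset of frozensets of literals."""
    return frozenset(frozenset(C) for C in clauses)

def apply(phi, F):
    """phi * F: drop the clauses satisfied by phi, then drop the falsified literals."""
    phi = frozenset(phi)
    falsified = frozenset(-x for x in phi)
    return frozenset(D - falsified for D in F if not (phi & D))

def split(F, v):
    """The two clause-sets obtained by setting variable v to 0 and to 1."""
    return apply({-v}, F), apply({v}, F)

A2 = clause_set({1, 2}, {-1, 2}, {1, -2}, {-1, -2})
F0, F1 = split(A2, 1)
print(sorted(map(sorted, F0)), sorted(map(sorted, F1)))
```

In this example both branches yield the clause-set $\{\{2\}, \{-2\}\}$, which is again minimally unsatisfiable.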

2) We do not apply the above method for gaining lower bounds as far as we can, but only as
needed in this report; see the end of Subsection [?] for some further remarks.

We have already mentioned the literature concerned with characterising the
classes $\mathcal{MU}_{\delta=k}$ (and subclasses) for small deficiencies $k \le 4$. Less ambitious is the
goal of polytime decision of these classes: the problem was raised in [?], and has
been solved via two independent approaches in [?] and [?] (indeed establishing
polytime SAT decision for inputs $F \in \mathcal{CLS}$ and fixed $\delta^*(F)$), later strengthened in
[93] (showing that SAT decision is even fixed-parameter tractable in $\delta^*(F)$; see also
[57] for generalisations and simplifications).


1.4.1 MUS 


As we have already mentioned, we consider MU as the “primal” building block 
for understanding unsatisfiability. In general an unsatisfiable clause-set can contain 
many minimally unsatisfiable sub-clause-sets, called “MUSs”. The task of enumer- 
ating all of them or at least some “good” ones is also of practical importance, to 
extract more information on the “causes” of unsatisfiability. A recent overview is
[74], while a clean approach to enumerate all MUSs, via hypergraph transversals, is
in [71] (the earliest appearance of the underlying observation seems [?, Theorem 2];
compare also [?, Subsection 4.3] for generalisations of the fundamental approach).
See also [?] for a reflection on various types of such sub-clause-sets, and on the
connection to autarky theory (compare Subsection 1.6.3).


1.4.2 Tovey’s problem (uniform clause-sets) 


This report appears to be the first systematic study of the problem of minimum 
variable occurrences / degrees in minimally unsatisfiable clause-sets and generali- 
sations, in dependency on the deficiency — asking for the existence of a variable 
occurring “infrequently” in general, or for extremal examples where all variables 
occur not infrequently. The “dual” problem is to consider maximum variable oc- 
currences / degrees — asking for the existence of a variable occurring frequently in 
general, or for extremal examples where all variables occur not frequently. More 
precisely, the maximum variable degree is 


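$$\nu\mathrm{vd}(F) := \max_{v \in \operatorname{var}(F)} \mathrm{vd}_F(v) \in \mathbb{N},$$

for $n(F) > 0$, while for a class $\mathcal{C} \subseteq \mathcal{CLS}$ of clause-sets, the quantity $\nu\mathrm{vd}(\mathcal{C})$ (to be
studied) is the minimum of $\nu\mathrm{vd}(F)$ for $F \in \mathcal{C}$.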

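This problem has been well-studied for $p$-uniform minimally unsatisfiable clause-sets,
starting with [?], [?], [?].³⁾ We denote by $p\text{-}\mathcal{CLS} \subset \mathcal{CLS}$ for $p \in \mathbb{N}_0$ the set of
all $F \in \mathcal{CLS}$ with $\forall C \in F : |C| \le p$, while by $\mathcal{UCLS} \subset \mathcal{CLS}$ we denote the set of
all uniform clause-sets, i.e., those $F \in \mathcal{CLS}$ such that for $C, D \in F$, $C \ne D$, holds
$|C| = |D|$. Finally $p\text{-}\mathcal{UCLS} := p\text{-}\mathcal{CLS} \cap \mathcal{UCLS}$ and $p\text{-}\mathcal{UMU} := p\text{-}\mathcal{UCLS} \cap \mathcal{MU}$. Now
the basic fact is

$$\nu\mathrm{vd}(p\text{-}\mathcal{UMU}) \ge p + 1$$

for $p \in \mathbb{N}$ ([?], generalised in [?, Corollary 7.3]). Trivially $\nu\mathrm{vd}(1\text{-}\mathcal{UMU}) = 2$,
and easily one sees $\nu\mathrm{vd}(2\text{-}\mathcal{UMU}) = 3$, while by [?] holds $\nu\mathrm{vd}(3\text{-}\mathcal{UMU}) = 4$. As
reported in [?], we have $\nu\mathrm{vd}(4\text{-}\mathcal{UMU}) = 5$, and these are all known precise values
of $\nu\mathrm{vd}(p\text{-}\mathcal{UMU})$ (where the notation $f(p) := \nu\mathrm{vd}(p\text{-}\mathcal{UMU}) - 1$ was introduced in
[?]). In [?] it was observed that extremal examples might be found in $\mathcal{MU}_{\delta=1}$, and
this work was recently extended in [?], establishing the asymptotically tight bound
$\lim_{p \to \infty} \frac{2^{p+1}}{e \cdot p} / \nu\mathrm{vd}(p\text{-}\mathcal{UMU}) = 1$ (where indeed $p\text{-}\mathcal{UMU} \cap \mathcal{MU}_{\delta=1}$ is considered).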

3) We remark that typically in the literature the connections to minimally unsatisfiable clause-sets
are not emphasised, but it is clear that when considering (uniform) unsatisfiable clause-sets
with a maximum variable degree as small as possible, then one can restrict attention to (uniform)
minimally unsatisfiable clause-sets (as worst-cases).

In our setting, studying the classes $\mathcal{MU}_{\delta=k}$, the max-var-degree is not very
relevant, since we have $\nu\mathrm{vd}(\mathcal{MU}_{\delta=1}) = 2$, while $\nu\mathrm{vd}(\mathcal{MU}_{\delta=k}) = 3$ for $k \ge 2$. This
can be seen as follows: As already noticed in [94], there is a poly-time transformation
from $\mathcal{CLS}$ to the class $\mathcal{CLS}(1,2) \subset \mathcal{CLS}$, consisting of those $F \in \mathcal{CLS}$ where for
every variable $v \in \operatorname{var}(F)$ we have $\mathrm{ld}_F(v) = 1$ and $\mathrm{ld}_F(-v) \le 2$. Namely if there
is a literal $x$ and two clauses $C, D \in F$ with $x \in C \cap D$, then we can introduce a
new variable $v$, replace $x$ in $C, D$ by $v$, and add the new clause $\{-v, x\}$, obtaining
$F'$. We study such extensions under the name of “singular DP-extension”, but
it is also easy to see directly that $F'$ is satisfiable iff $F$ is, that $F'$ is minimally
unsatisfiable iff $F$ is, and that $\delta(F') = \delta(F)$. By repeating this transformation, we
obtain $t^{1,2} : \mathcal{CLS} \to \mathcal{CLS}(1,2)$. So for $F \in \mathcal{MU}$ we get $t^{1,2}(F) \in \mathcal{MU} \cap \mathcal{CLS}(1,2)$
with $\delta(t^{1,2}(F)) = \delta(F)$. Whence for all $k \in \mathbb{N}$ we have $\nu\mathrm{vd}(\mathcal{MU}_{\delta=k}) \le 3$. Now
trivially $\nu\mathrm{vd}(\mathcal{MU}_{\delta=1}) = 2$ due to $\{\{1\}, \{-1\}\} \in \mathcal{MU}_{\delta=1}$. On the other hand, if
for $F \in \mathcal{MU}$ holds $\nu\mathrm{vd}(F) \le 2$ (thus $\nu\mathrm{vd}(F) = 2$), then via so-called singular
DP-reduction this clause-set can be reduced to $\{\bot\}$ (the clause-set containing only
the empty clause), whence $F \in \mathcal{MU}_{\delta=1}$ (this is well-known; compare Example [?] later).

So for the study of the max-var-degree, the uniformity restriction seems essential.
This is similar to many investigations into (colour-)critical hypergraphs (discussed
in Subsection 1.6.1 below), where uniformity is a crucial assumption, and the
clause-length $p$ is the main parameter. For investigations into the case of uniform (general)
clause-sets, where clauses share at most one variable, see [?]. The number of
clauses in $F \in p\text{-}\mathcal{UMU}$ has been studied in [69], showing that for $p = 2$ holds
$c(F) \le 4 n(F) - 2$, while for $p \ge 3$ there are $F$ with $c(F) = \Omega(n(F)^p)$. Finally, the
number of conflicts (clashes) in $F \in p\text{-}\mathcal{UMU}$ is considered in [?], and for a review
of the use of the Lovász Local Lemma in this context see [7].

In contrast, for the study of the minimum variable degree as in this report, in
dependency on the deficiency, the restriction to uniformity seems not interesting,
and is also not needed; instead unrestricted clause-sets are considered. We remark
that for every $p \in \mathbb{N}$, $p \ge 3$, there is a polytime translation $t_p : \mathcal{CLS} \to p\text{-}\mathcal{UCLS}$,
such that $t_p(F)$ is satisfiable iff $F$ is, $t_p(F)$ is minimally unsatisfiable iff $F$ is, and
$\delta(t_p(F)) = \delta(F)$. This works by replacing clauses $C$ with $|C| < p$ by clauses
$C \cup \{v\}$, $C \cup \{-v\}$ for some new variable $v$ (in the $\mathcal{MU}$-case we will call this a
“non-strict full subsumption extension”), and by replacing clauses $C$ with $|C| > p$ by
clauses $C' \cup \{v\}$, $C'' \cup \{-v\}$ for some new variable $v$ and choosing clauses $C', C''$
with $C = C' \cup C''$ and $|C'| = p - 1$, $|C''| \ge p - 1$ (in the $\mathcal{MU}$-case again we have
a singular DP-extension). But the transformation $t_p$ appears to be useless, since it
completely garbles the structure of $F$.

We conclude these remarks on p-uniform clause-sets by the observation, that 
for p > 4 the instances involved above become quickly very big, and only methods 
from random analysis are available (which by nature are very rough). It seems that 
these considerations do not have practical relevance. In contrast, we consider all 
minimal unsatisfiable clause-sets (and more), that is, the deficiency does not filter 
out clause-sets, but only organises them in layers. And for a wide range of deficiency 
values, say, k = 1,..., 10000, there are interesting and relevant examples. 


1.5 Autarkies 


An important tool, used in this report to go beyond $\mathcal{MU}$, is the theory of autarkies,
which also provides a strong link to various areas of combinatorics; the relations to
hypergraph colouring will be discussed in Subsection 1.6.3. Recall that a partial
assignment $\varphi$ is an autarky for $F \in \mathcal{CLS}$ iff every clause $C \in F$ touched by $\varphi$
(i.e., $\varphi \cap (C \cup -C) \ne \emptyset$) is satisfied by $\varphi$ (i.e., $\varphi \cap C \ne \emptyset$), which is equivalent to
$\forall F' \subseteq F : \varphi * F' \subseteq F'$. Autarkies were introduced in [?] for improved worst-case
upper bounds for SAT decision, applying that for an autarky $\varphi$ obviously $\varphi * F$ is
sat-equivalent to $F$. For a recent overview see [43].
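The autarky condition is a direct check over the clauses. A small sketch under the conventions of this introduction (partial assignments as clash-free sets of satisfied literals, clause-sets as frozensets of frozensets of non-zero integers; function names are illustrative), which also shows on an example that applying an autarky only removes clauses:

```python
def clause_set(*clauses):
    """Hypothetical helper: a frozenset of frozensets of literals."""
    return frozenset(frozenset(C) for C in clauses)

def is_autarky(phi, F):
    """Every clause touched by phi (sharing a variable with it) must be satisfied by phi."""
    phi = frozenset(phi)
    touching = phi | frozenset(-x for x in phi)
    return all(phi & C for C in F if touching & C)

def apply(phi, F):
    """phi * F: drop satisfied clauses, then drop falsified literals."""
    phi = frozenset(phi)
    falsified = frozenset(-x for x in phi)
    return frozenset(D - falsified for D in F if not (phi & D))

F = clause_set({1, 2}, {-2, 3}, {-3})
print(is_autarky({1}, F), apply({1}, F) <= F)  # literal 1 is pure, so {1} is an autarky
print(is_autarky({2}, F))                      # {2} falsifies a literal of {-2, 3} without satisfying it
```

The empty partial assignment touches no clause and is therefore always a (trivial) autarky.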


Autarky reduction. Autarky reduction, the reduction of $F \in \mathcal{CLS}$ to $\varphi * F \in
\mathcal{CLS}$ for a non-trivial autarky $\varphi$, is an essential concept, algorithmically as well as
for theoretical understanding; see [43, Subsection 11.10] for an overview on finding
autarkies. If we reduce all autarkies, then we obtain the (unique) lean kernel of $F$. If
there are no non-trivial autarkies, then we have a lean clause-set, i.e., $F \in \mathcal{LEAN}$,
as already mentioned in Subsection 1.2; the concept was introduced in [69], and [43,
Subsection 11.8.3] contains more information. The lean kernel of $F$ is the largest
lean sub-clause-set of a clause-set, that is, $\bigcup \{F' \subseteq F : F' \in \mathcal{LEAN}\}$; for a recent
paper on the computation of the lean kernel see [74].

The decision of leanness is coNP-complete, and so consideration of special au- 
tarkies is of interest; actually, these considerations are not just “algorithmic hacks” , 
but in a sense represent various areas of combinatorics (for example matching the- 
ory) via “autarky systems”. 


Autarky systems. The notion of an “autarky system”, as a selection of special
autarkies with similar good properties as general autarkies, was introduced in [?],
partially further expanded in [54], and overviewed in [43, Subsection 11.11].

The starting point for an autarky system is to single out a restricted notion of
autarky. This restricted autarky notion implies a restricted satisfiability notion,
namely clause-sets satisfiable via autarky reduction, using only these special
autarkies. This is indeed equivalent for “normal autarky systems” to the clause-set
being satisfiable by a single special autarky.⁴⁾ Often the general autarkies of the
system can be derived from the extreme case of satisfiability through such autarkies.
For arbitrary autarky systems also the notions “minimal unsatisfiability” and “lean”
are defined, and are central properties.

Balanced autarkies are an example of a rather general autarky system, the basis
for autarkies for hypergraph colouring; here for an autarky, touched clauses
need not only have some satisfied literal, but also some falsified literal. The
corresponding satisfiability notion is “NAE-satisfiability”, and will be further discussed
in Subsection 1.6.3.


Matching autarkies. The autarky system especially of importance in this report,
besides the full system, is that of matching autarkies; for a short introduction see
[43, Subsection 11.11.2]. They yield the set $\mathcal{MLEAN} \supset \mathcal{LEAN}$ of matching-lean
clause-sets, and the set $\mathcal{MSAT} \subset \mathcal{SAT}$ of matching-satisfiable clause-sets (called
“matched clause-sets” in [53]):

• A matching autarky for $F \in \mathcal{CLS}$ is an autarky $\varphi$ for $F$ such that for all
  $C \in F$ touched by $\varphi$ one can select $x_C \in C$ with $x_C \in \varphi$ such that the
  underlying variables $\operatorname{var}(x_C)$ are pairwise different.

• We have $F \in \mathcal{MSAT} \Leftrightarrow \forall F' \subseteq F : \delta(F') \le 0$, i.e., $\delta^*(F) = 0$.

• And $F \in \mathcal{MLEAN} \Leftrightarrow \forall F' \subset F : \delta(F') < \delta(F)$.

• Thus for $F \in \mathcal{MLEAN}$ holds $\delta^*(F) = \delta(F)$, and for $F \ne \top$ holds $\delta(F) \ge 1$
  (note $\delta(\top) = 0$), a vast generalisation of this fact for $\mathcal{MU}$.

• More strongly, we have for $F \ne \top$ that $F \in \mathcal{MLEAN} \Leftrightarrow \sigma(F) \ge 1$.

• Every $F \in \mathcal{CLS}$ has a largest matching-lean sub-clause-set, the matching-lean
  kernel, namely $\bigcup \{F' \subseteq F : F' \in \mathcal{MLEAN}\}$, computable in polynomial time
  (for example via reduction by matching autarkies).
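The two deficiency characterisations in the list above can be tested directly on small instances. The sketch below enumerates all sub-clause-sets, so it is exponential and only illustrates the definitions (the polynomial-time methods via matchings mentioned in the text are not implemented here); function names are illustrative:

```python
from itertools import combinations

def clause_set(*clauses):
    """Hypothetical helper: a frozenset of frozensets of literals."""
    return frozenset(frozenset(C) for C in clauses)

def deficiency(F):
    return len(F) - len({abs(x) for C in F for x in C})

def sub_clause_sets(F):
    """All subsets F' of F, including the empty clause-set and F itself."""
    clauses = list(F)
    for r in range(len(clauses) + 1):
        for S in combinations(clauses, r):
            yield frozenset(S)

def max_deficiency(F):
    """delta*(F): maximum of delta(F') over all F' contained in F."""
    return max(deficiency(S) for S in sub_clause_sets(F))

def is_matching_satisfiable(F):
    """F is matching-satisfiable iff delta*(F) = 0."""
    return max_deficiency(F) == 0

def is_matching_lean(F):
    """F is matching-lean iff every proper sub-clause-set has strictly smaller deficiency."""
    return all(deficiency(S) < deficiency(F) for S in sub_clause_sets(F) if S != F)

A2 = clause_set({1, 2}, {-1, 2}, {1, -2}, {-1, -2})
G = clause_set({1, 2}, {-1, 3})
H = clause_set({1}, {-1}, {2, 3})
for F in (A2, G, H):
    print(deficiency(F), max_deficiency(F), is_matching_satisfiable(F), is_matching_lean(F))
```

Here `A2` is matching-lean with $\delta^* = \delta = 2$, `G` is matching-satisfiable, and `H` is neither, since its sub-clause-set $\{\{1\}, \{-1\}\}$ has deficiency 1.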


4) “Normal autarky systems” were called “strong autarky systems” in [?, Section 8].

Linear autarkies. A stronger autarky system than matching autarkies is given
by “linear autarkies”; we will not use them for the results of this report, but they are
an important link to combinatorics, and so we discuss them here; see [43, Subsection
11.11.3] for a more elaborated introduction. “Simple linear autarkies” for $F \in \mathcal{CLS}$
have been introduced in [?], based on linear programming. For $F \in \mathcal{CLS}$ we
consider the clause-variable matrix $M(F)$, which is a $c(F) \times n(F)$ matrix over $\mathbb{R}$
(or over $\mathbb{Q}$ for computational purposes), which encodes in the rows the clauses and
in the columns the variables, by using 0 for absence of the variable, and $\pm 1$ for
positive resp. negative sign. Now the simple linear autarkies $\varphi$ are obtained from
solutions $\vec{z} \in \mathbb{R}^{n(F)}$ of $M(F) \cdot \vec{z} \ge 0$, by translating the values $z_i$, where the indices
$i$ correspond to the variables of $F$, into “unassigned” for $z_i = 0$, “true” (i.e., 1) for
$z_i > 0$, and “false” (i.e., 0) for $z_i < 0$. It is an easy exercise to see that this yields
indeed autarkies. We have a non-trivial simple linear autarky iff $M(F) \cdot \vec{z} \ge 0$ has
a non-trivial solution. We obtain the classes


• $\mathcal{LLEAN}$ of “linearly lean clause-sets” (not having a non-trivial simple linear
  autarky), with $\mathcal{LEAN} \subset \mathcal{LLEAN} \subset \mathcal{MLEAN}$;

• $\mathcal{LSAT}$ of “linearly satisfiable clause-sets” (satisfiable by a sequence of simple
  linear autarkies), with $\mathcal{MSAT} \subset \mathcal{LSAT} \subset \mathcal{SAT}$.

Linear autarkies, as introduced in [51], are obtained from simple linear autarkies
by composition, corresponding to iterated reduction by simple linear autarkies;
simple linear autarkies yield an autarky system, while linear autarkies yield a normal
autarky system. The main point here is that the reduction to the linearly-lean
kernel can be done by a single linear autarky, and linearly satisfiable clause-sets are
satisfiable by a single linear autarky. In Subsection [?] we discuss the special case
of “balanced linear autarkies”.
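The translation from a solution vector to a simple linear autarky can be sketched as follows. The code builds the clause-variable matrix $M(F)$ with entries $0, \pm 1$ and, for a given candidate vector, checks $M(F) \cdot \vec{z} \ge 0$ and returns the corresponding partial assignment. It does not search for solutions (that would be a linear-programming task); the names and the fixed ordering of rows and columns are assumptions made for this illustration:

```python
def clause_set(*clauses):
    """Hypothetical helper: a frozenset of frozensets of literals."""
    return frozenset(frozenset(C) for C in clauses)

def clause_variable_matrix(F):
    """Return (variables, clauses, M): rows are clauses, columns are variables,
    with entry +1 / -1 for a positive / negative occurrence and 0 for absence."""
    variables = sorted({abs(x) for C in F for x in C})
    clauses = sorted(F, key=sorted)  # some fixed order of the rows
    M = [[1 if v in C else -1 if -v in C else 0 for v in variables] for C in clauses]
    return variables, clauses, M

def simple_linear_autarky(F, z):
    """If M(F) z >= 0 holds row-wise, translate z into a partial assignment
    (set of satisfied literals); otherwise return None."""
    variables, _, M = clause_variable_matrix(F)
    if len(z) != len(variables):
        raise ValueError("z needs one entry per variable of F")
    if any(sum(m * zi for m, zi in zip(row, z)) < 0 for row in M):
        return None
    return frozenset(v if zi > 0 else -v for v, zi in zip(variables, z) if zi != 0)

def is_autarky(phi, F):
    touching = frozenset(phi) | frozenset(-x for x in phi)
    return all(frozenset(phi) & C for C in F if touching & C)

F = clause_set({1, 2}, {-1, 2}, {-2, 3})
phi = simple_linear_autarky(F, [0, 1, 1])
print(sorted(phi), is_autarky(phi, F))
print(simple_linear_autarky(F, [1, 1, 0]))  # row for {-2, 3} is negative, so no autarky is returned
```

The reason the translation yields an autarky is visible in the row sums: a touched clause has at least one non-zero term, and a non-negative sum with a non-zero term must contain a positive term, that is, a satisfied literal.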


1.6 Connections to combinatorics 


We now discuss the connections between SAT and combinatorics in a wider context
than the degree considerations of this report, concentrating on aspects related to
minimal unsatisfiability and autarkies (if one is only interested in the results of this
report, then these discussions may be ignored). A general source on SAT is the
handbook [?]; a classical connection to combinatorics, random satisfiability, is
discussed in Chapter 8 ([2]) there, and of further general interest to combinatorics is
Chapter 10 ([84]) on symmetry (group theory), Chapter 13 ([87]) on fixed-parameter
tractable problems (for example treewidth and related notions), and Chapter 17
([106]) on handling various combinatorial designs via SAT solving, for example
from Ramsey theory. Ramsey theory has strong connections to hypergraph
colouring, which we discuss next; we mention that applying SAT solving to solve
hypergraph colouring problems is a powerful tool, and a recent overview can be
found in [?] (where especially van-der-Waerden numbers are discussed).


1.6.1 Hypergraph colouring 


Hypergraph-colouring, especially 2-colouring, and SAT are closely connected; see
[?, Section 5] for a general introduction and overview on hypergraph colouring
(from the combinatorial point of view), while a monograph is given with [?]. An
overview especially on the question of the minimum number of hyperedges for a
given number of vertices in non-$k$-colourable hypergraphs is given in [?].


Hypergraphs. For this introduction, a hypergraph G is a finite set of finite
subsets of ℤ; so G itself is the set of hyperedges, i.e., E(G) := G, while ⋃G is the
set of vertices, i.e., V(G) := ⋃G. The set of all hypergraphs is denoted by HYP.
Let the deficiency be δ_H(G) := |E(G)| − |V(G)|. Note that clause-sets are
special hypergraphs (CLS ⊂ HYP), but their deficiency is defined differently.
Hypergraphs G with δ_H(G) = 0 are called square hypergraphs. Special hypergraphs
are the positive clause-sets, and the set of all positive clause-sets is denoted by
PCLS := {F ∈ CLS : ⋃F ⊆ ℕ} = {G ∈ HYP : V(G) ⊆ ℕ}. For F ∈ PCLS
we have δ(F) = δ_H(F); obviously every hypergraph can be renamed to a positive
clause-set. From general clause-sets F ∈ CLS we obtain two hypergraphs:


• F itself is a hypergraph (breaking the link between positive and negative
literals, which are now just unrelated vertices).


We note that we could have allowed CLS = HYP, by allowing tautological
clauses (i.e., clauses containing clashing literals) and self-complementary
literals (−0 = 0). In certain contexts allowing such degenerations has advantages,
but in our context it seems best to ban them (for example so that we have a direct
correspondence between clauses and partial assignments).


• The “variable-hypergraph” of F is {var(C) : C ∈ F} ∈ PCLS. This formation
is important, for example, for applying methods from matching theory.


For positive clause-sets both formations collapse to the identity, and we treat posi- 
tive clause-sets as representing (general) hypergraphs by (special) clause-sets. 
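
As a small illustration of the two hypergraphs obtained from a clause-set and of the two deficiencies, here is a minimal Python sketch. The representation (clauses as frozensets of non-zero integers, clause-sets and hypergraphs as Python sets of such frozensets) and all function names are assumptions made for this example only.

```python
# Assumed representation: a clause is a frozenset of non-zero integers
# (v for a positive literal, -v for a negative one); a clause-set or
# hypergraph is a Python set of such frozensets.

def vertices(G):
    """V(G): the union of all hyperedges of G."""
    return frozenset().union(*G)

def deficiency_hyp(G):
    """delta_H(G) = |E(G)| - |V(G)|."""
    return len(G) - len(vertices(G))

def variable_hypergraph(F):
    """{var(C) : C in F}; clauses with the same variables collapse."""
    return {frozenset(abs(x) for x in C) for C in F}

def deficiency_cls(F):
    """delta(F) = c(F) - n(F): number of clauses minus number of variables."""
    return len(F) - len(vertices(variable_hypergraph(F)))

F = {frozenset({1, 2}), frozenset({-1, 2}), frozenset({-2})}
print(deficiency_cls(F))                       # clause-set deficiency
print(deficiency_hyp(F))                       # F read as a hypergraph on literals
print(deficiency_hyp(variable_hypergraph(F)))  # its variable-hypergraph

# For a positive clause-set both notions of deficiency agree.
P = {frozenset({1, 2}), frozenset({2, 3}), frozenset({1, 3})}
print(deficiency_cls(P) == deficiency_hyp(P))
```

For the non-positive F the three values differ, which is the point of the remark that the deficiency of clause-sets is defined differently from that of hypergraphs.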


Colouring. A k-colouring for k ∈ ℕ_0 of G is a map f : V(G) → {0,...,k−1} such
that for all H ∈ G there are x, y ∈ H with f(x) ≠ f(y); G is called k-colourable if
there exists a k-colouring of G. Note that if there are H ∈ G with |H| ≤ 1, then
G is not k-colourable for any k. A hypergraph G is critically k-colourable if G is
k-colourable, not (k−1)-colourable, but for all H ∈ G the hypergraph G \ {H} is
(k−1)-colourable. In the SAT-context there is no need to discard hyperedges containing
at most one vertex, and then minimal non-k-colourability is more appropriate,
that is, G is not k-colourable (possibly not colourable at all), while after removal of
any hyperedge G becomes k-colourable. The set of all minimally non-k-colourable
hypergraphs is denoted by MNC^k ⊂ HYP for k ∈ ℕ_0. We have {∅}, {{x}} ∈
MNC^k for all k ∈ ℕ_0 and x ∈ ℤ.
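
The definitions above can be checked by brute force on small hypergraphs. The following is a minimal sketch, assuming hypergraphs are Python sets of frozensets of integers; the function names are hypothetical and the enumeration is exponential in the number of vertices.

```python
from itertools import product

def vertices(G):
    return sorted(frozenset().union(*G))

def is_k_colourable(G, k):
    """Brute force: try every map f : V(G) -> {0,...,k-1}."""
    V = vertices(G)
    for colours in product(range(k), repeat=len(V)):
        f = dict(zip(V, colours))
        # every hyperedge must contain two vertices of different colour
        if all(len({f[x] for x in H}) >= 2 for H in G):
            return True
    return False

def is_minimally_non_k_colourable(G, k):
    """G is not k-colourable, but removing any single hyperedge makes it so."""
    return (not is_k_colourable(G, k)
            and all(is_k_colourable(G - {H}, k) for H in G))

empty_edge = {frozenset()}          # the hypergraph {∅}
singleton = {frozenset({7})}        # a hypergraph {{x}}
triangle = {frozenset({1, 2}), frozenset({2, 3}), frozenset({1, 3})}

for k in range(4):
    print(k, is_minimally_non_k_colourable(empty_edge, k),
          is_minimally_non_k_colourable(singleton, k))
print(is_minimally_non_k_colourable(triangle, 2))
```

The triangle is used here only as a convenient small element of MNC^2; it is square, i.e., it has as many hyperedges as vertices.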

We are especially interested in MNC^2. For G ∈ MNC^2 holds δ_H(G) ≥ 0, as
shown in [?], and so we can consider the sets MNC^2_{δ_H=k} for deficiencies k ∈ ℕ_0 (all
minimally non-2-colourable hypergraphs of deficiency (exactly) k).⁵⁾ The famous
problem of deciding in polynomial time whether a directed graph contains an even
cycle is equivalent to the problem of deciding “F ∈ MNC^2_{δ_H=0} ?” for F ∈ HYP
(via simple transformations), and this problem was finally solved in [?], [76]. It
was conjectured in [54] that for all k ∈ ℕ_0 the classes MNC^2_{δ_H=k} are decidable
in polynomial time (see also [43, Conjecture 11.12.1]). More on this in Subsection
1.6.4. In [?] one finds more information on vertex degrees in uniform elements of
MNC^2_{δ_H=0} (i.e., where all hyperedges have the same length).


5) Indeed in [?, Corollary 8.2] it is shown δ_H(G) ≥ 0 for all G ∈ MNC^k for k ≥ 2, as a simple
application of the autarky method; note that for G := {{1,...,n}} ∈ MNC^k for k ≤ 1 and n ≥ 2
holds δ_H(G) = 1 − n < 0.


Translating hypergraphs into clause-sets. For a positive hypergraph G ∈
PCLS we obtain the translation of 2-colouring to satisfiability via

F_2(G) := G ∪ {−H : H ∈ G} ∈ CLS.

For a general discussion of such translations, also considering more colours, see [63,
Subsection 1.2]. A hypergraph G ∈ PCLS is 2-colourable iff F_2(G) is satisfiable, and
G is minimally non-2-colourable iff F_2(G) is minimally unsatisfiable, i.e., F_2(G) ∈
MU ⇔ G ∈ MNC^2 (this is easy to prove, and a special case of [?, Lemma 8.1]).
Regarding the deficiency we have δ(F_2(G)) = δ_H(G) + |E(G)| for ∅ ∉ G, and thus
e.g. F_2(MNC^2_{δ_H=0} ∩ PCLS) is not contained in MU_{δ=k} for any k ∈ ℕ.
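
The translation F_2 and the deficiency formula can be tried out on a small example. This is a minimal sketch with a brute-force satisfiability test, assuming clauses are frozensets of non-zero integers; the function names are hypothetical and chosen for this illustration only.

```python
from itertools import product

def F2(G):
    """F_2(G): G together with the complemented hyperedges {-H : H in G}."""
    return set(G) | {frozenset(-x for x in H) for H in G}

def is_satisfiable(F):
    """Brute force over all total assignments of var(F)."""
    V = sorted({abs(x) for C in F for x in C})
    for bits in product([False, True], repeat=len(V)):
        val = dict(zip(V, bits))
        if all(any(val[abs(x)] == (x > 0) for x in C) for C in F):
            return True
    return False

def is_minimally_unsatisfiable(F):
    return (not is_satisfiable(F)
            and all(is_satisfiable(F - {C}) for C in F))

def deficiency(F):
    return len(F) - len({abs(x) for C in F for x in C})

triangle = {frozenset({1, 2}), frozenset({2, 3}), frozenset({1, 3})}
path = {frozenset({1, 2}), frozenset({2, 3})}

print(is_minimally_unsatisfiable(F2(triangle)))  # triangle is minimally non-2-colourable
print(is_satisfiable(F2(path)))                  # path is 2-colourable
# delta(F_2(G)) = delta_H(G) + |E(G)|; the triangle has delta_H = 0 and 3 hyperedges
print(deficiency(F2(triangle)))
```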

A slight generalisation of the image F_2(PCLS) under this translation is the
class of complementation-invariant clause-sets F ∈ CLS, characterised by C ∈ F ⇔
−C ∈ F for clauses C, as introduced in [?] (see also [?, Subsection 11.4.5]), while
the image F_2(PCLS) is the set of complementation-invariant PN-clause-sets, that
is, clause-sets F where every clause C ∈ F is positive (i.e., C ⊆ ℕ) or negative
(−C ⊆ ℕ). See Subsection 1.6.3 for how “autarkies”, as considered on F_2(G), can
help understanding G.


Translating clause-sets into hypergraphs. In the other direction a translation
was provided in [?]. For F ∈ CLS let

e(F) := {C ∪ {0} : C ∈ F} ∪ {{v, −v} : v ∈ var(F)} ∈ HYP.

The hypergraph e(F) is 2-colourable iff F is satisfiable, and F is minimally
unsatisfiable iff e(F) is minimally non-2-colourable, i.e., e(F) ∈ MNC^2 ⇔ F ∈ MU (the
direction “⇐” of the latter statement is stated in the proof of Theorem 3 in [?];
the other direction is (also) very easy). Furthermore δ_H(e(F)) = δ(F) − 1. Thus e
embeds the classes MU_{δ=k} into MNC^2_{δ_H=k−1}, which motivates the conjecture that
all MNC^2_{δ_H=k} for k ∈ ℕ_0 are polytime decidable, as a strengthening of the polytime
decision of the MU_{δ=k} for k ∈ ℕ (recall Subsection 1.4).
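
The translation e and the identity δ_H(e(F)) = δ(F) − 1 can be checked on small non-empty clause-sets. A minimal sketch under the same assumed representation as before (clauses as frozensets of non-zero integers, vertex 0 as the additional vertex); all names are hypothetical.

```python
from itertools import product

def e(F):
    """e(F) = {C u {0} : C in F} u {{v, -v} : v in var(F)}."""
    variables = {abs(x) for C in F for x in C}
    return ({frozenset(C | {0}) for C in F}
            | {frozenset({v, -v}) for v in variables})

def is_2_colourable(G):
    V = sorted(frozenset().union(*G))
    for colours in product((0, 1), repeat=len(V)):
        f = dict(zip(V, colours))
        if all(len({f[x] for x in H}) == 2 for H in G):
            return True
    return False

def is_satisfiable(F):
    V = sorted({abs(x) for C in F for x in C})
    for bits in product([False, True], repeat=len(V)):
        val = dict(zip(V, bits))
        if all(any(val[abs(x)] == (x > 0) for x in C) for C in F):
            return True
    return False

def deficiency(F):
    return len(F) - len({abs(x) for C in F for x in C})

def deficiency_hyp(G):
    return len(G) - len(frozenset().union(*G))

unsat = {frozenset({1}), frozenset({-1})}
sat = {frozenset({1, 2}), frozenset({-1})}
for F in (unsat, sat):
    print(is_satisfiable(F), is_2_colourable(e(F)),
          deficiency(F), deficiency_hyp(e(F)))
```

For the unsatisfiable example, e(F) is a triangle on the vertices 0, 1, −1, which ties in with the minimally non-2-colourable example of a triangle.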

We remark that via this embedding e we obtain a proof of δ(F) ≥ 1 for F ∈ MU
from δ_H(G) ≥ 0 for G ∈ MNC^2 (this is one of the proofs given in [?]). In [?] also an
alternative proof of δ_H(G) ≥ 0 is given, based on matching theory, plus one further
proof of δ(F) ≥ 1, using linear algebra, as in [?]. In Subsection 1.6.3 we will further
comment on these proofs, as they are unfolded in the theory of “autarkies”.

We also remark that the hypergraph class e(MU_{δ=1}) ⊂ MNC^2_{δ_H=0} has the
property that every hypergraph in it different from {{0}} has a vertex of degree
2 (since every F ∈ MU_{δ=1} different from {⊥} has a variable of degree 2). More
generally, for all k ∈ ℕ every hypergraph in e(MU_{δ=k}) \ {{{0}}} ⊂ MNC^2_{δ_H=k−1}
has a vertex of degree at most k + 1. We do not know whether the minimum
vertex-degrees of general G ∈ MNC^2_{δ_H=k} for any (fixed) k ∈ ℕ_0 are bounded.


1.6.2  Hypergraph transversals 


For G ∈ HYP let Tr(G) ∈ HYP, the transversal hypergraph of G, be defined as the
set of all minimal T ⊆ V(G) such that T ∩ H ≠ ∅ for all H ∈ G. The Transversal
Hypergraph Problem is the computational problem, given G, G′ ∈ HYP, to decide
whether Tr(G) = G′ holds. Equivalently, the input is G ∈ HYP, and it is to be
decided whether G = Tr(G) holds (obviously this is a special case of the Transversal
Hypergraph Problem, and by a polynomial-time translation the general case can
be reduced to it). For an overview on this important problem and its many guises
see [?]. It is known that the problem is solvable in quasi-polynomial time, and the
long outstanding problem is whether it can be solved in polynomial time.

An intersecting hypergraph is a hypergraph G ∈ HYP such that for H, H′ ∈ G
with H ≠ H′ holds H ∩ H′ ≠ ∅; the class of all intersecting hypergraphs is denoted
by IHYP ⊂ HYP. By definition we have G ⊆ Tr(G) for G ∈ IHYP, and it is
not hard to see that for G ∈ IHYP holds G ∈ MNC^2 iff Tr(G) = G. Thus the
Transversal Hypergraph Problem is equivalent to the problem of deciding whether
an intersecting hypergraph is minimally non-2-colourable. This raises the natural
question of deciding the classes (MNC^2 ∩ IHYP)_{δ_H=k} for k ∈ ℕ_0. The case
k = 0 has been handled in [?]; indeed not just deciding the class in polynomial
time, but efficiently classifying the elements. The cases k ≥ 1 appear to be open,
and whether decision is possible in polynomial time for fixed k, or is even
fixed-parameter tractable (fpt) in k, is an interesting test case for the general Hypergraph
Transversal Problem, as well as being relevant for the understanding of minimally
non-2-colourable hypergraphs.
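
For small hypergraphs, Tr(G) and the connection to minimal non-2-colourability can be explored by exhaustive search. The following is a minimal sketch (exponential in |V(G)|), assuming hypergraphs are Python sets of frozensets of integers; the function names are hypothetical.

```python
from itertools import combinations, product

def transversal_hypergraph(G):
    """Tr(G): all minimal T within V(G) that meet every hyperedge."""
    V = sorted(frozenset().union(*G))
    hitting = [frozenset(T)
               for r in range(len(V) + 1)
               for T in combinations(V, r)
               if all(set(T) & H for H in G)]
    # keep only the inclusion-minimal hitting sets
    return {T for T in hitting if not any(S < T for S in hitting)}

def is_intersecting(G):
    return all(H & K for H, K in combinations(G, 2))

def is_2_colourable(G):
    V = sorted(frozenset().union(*G))
    return any(all(len({f[x] for x in H}) == 2 for H in G)
               for f in (dict(zip(V, bits))
                         for bits in product((0, 1), repeat=len(V))))

def is_mnc2(G):
    return (not is_2_colourable(G)
            and all(is_2_colourable(G - {H}) for H in G))

triangle = {frozenset({1, 2}), frozenset({2, 3}), frozenset({1, 3})}
star = {frozenset({1, 2}), frozenset({1, 3})}

for G in (triangle, star):
    print(is_intersecting(G), transversal_hypergraph(G) == G, is_mnc2(G))
```

Both examples are intersecting; the triangle satisfies Tr(G) = G and is minimally non-2-colourable, while the star has Tr(G) = {{1}, {2,3}} ≠ G and is 2-colourable.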

The translation of intersecting hypergraphs G ∈ IHYP into clause-sets F_2(G) ∈
CLS yields also a natural and interesting class of clause-sets. Bihitting
clause-sets, introduced in [?, Subsection 4.2], are those F ∈ CLS where F′, F″ ⊆ F
with F′ ∪ F″ = F, F′ ∩ F″ = ∅ exist, such that for all C′ ∈ F′, C″ ∈ F″ holds
C′ ∩ −C″ ≠ ∅, while F′, F″ themselves are clash-free (i.e., (⋃F′) ∩ −(⋃F′) = ∅ and
(⋃F″) ∩ −(⋃F″) = ∅). Obviously, the images under F_2 of intersecting hypergraphs
are precisely the bihitting complementation-invariant PN-clause-sets (i.e., the set of
bihitting clause-sets in the image of F_2), and deciding their minimal unsatisfiability
is thus another manifestation of the Hypergraph Transversal Problem (directly
related to the decision “G = Tr(G) ?”). And another one is to decide SAT for general
bihitting clause-sets (as can be easily seen, and is discussed in [?, Subsection 4.3];
directly related to the decision “Tr(G) = G′ ?”).

In [?, Theorem 8.14] (the first 6 sections are covered by [64, 53]) the
characterisation of [?] (the intersecting G ∈ MNC^2_{δ_H=0}) is translated into this language.


1.6.3 Autarkies for hypergraphs 


We now discuss two autarky systems (recall Subsection [?] for a general
introduction) which are especially relevant for hypergraph colouring.


Balanced autarkies. Balanced autarkies for F ∈ CLS (introduced in [?]; [43,
Subsection 11.11.4] provides an introduction) are partial assignments φ which in
every clause of F they touch satisfy as well as falsify at least one literal (that is,
for C ∈ F with C ∩ (φ ∪ −φ) ≠ ∅ holds C ∩ φ ≠ ∅ as well as C ∩ −φ ≠ ∅). This is
a normal autarky system, and thus we basically have all the good properties general
autarkies have. Balanced autarkies are closely related to hypergraph colouring.
The balanced autarkies for F are precisely the autarkies of F ∪ {−C : C ∈ F}, and
every autarky for a complementation-invariant clause-set is automatically balanced.
A clause-set is balanced-satisfiable, i.e., can be satisfied by a balanced autarky, iff
it is NAE-satisfiable (“not-all-equal”; see [?] for basic results).

Balanced autarkies provide the general autarky form for PCLS (whose elements
are all trivially satisfiable, and thus unrestricted autarkies are not of interest here),
which represents hypergraphs for the 2-colouring problem: an F ∈ PCLS is
2-colourable iff it is balanced-satisfiable, and F is minimally non-2-colourable iff it
is minimally balanced-unsatisfiable. Finally we have balanced lean clause-sets (i.e.,
having no non-trivial balanced autarkies), and this is the appropriate notion of
“leanness” for hypergraphs, as represented by the class PCLS; more precisely, a
hypergraph G is lean iff for an isomorphic F ∈ PCLS (isomorphic as hypergraph)
we have that F is balanced lean. For lean hypergraphs G we have δ_H(G) ≥ 0, and
this is properly treated by “balanced linear autarkies”.
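
The definition of balanced autarkies and their description as the autarkies of F ∪ {−C : C ∈ F} can be compared by enumeration on small clause-sets. A minimal sketch, assuming a partial assignment is represented by the set of literals it makes true; this representation and all function names are assumptions for the example.

```python
from itertools import product

def partial_assignments(F):
    """All partial assignments on var(F), each as the set of literals made true."""
    V = sorted({abs(x) for C in F for x in C})
    for choice in product((0, 1, -1), repeat=len(V)):
        yield frozenset(s * v for v, s in zip(V, choice) if s != 0)

def touches(phi, C):
    return any(x in phi or -x in phi for x in C)

def is_autarky(phi, F):
    """Every touched clause contains a satisfied literal."""
    return all(C & phi for C in F if touches(phi, C))

def is_balanced_autarky(phi, F):
    """Every touched clause contains a satisfied and a falsified literal."""
    neg = frozenset(-x for x in phi)
    return all(C & phi and C & neg for C in F if touches(phi, C))

def complement_closure(F):
    """F together with all complemented clauses -C."""
    return set(F) | {frozenset(-x for x in C) for C in F}

F = {frozenset({1, 2}), frozenset({-1, 3}), frozenset({2, -3})}
balanced = {p for p in partial_assignments(F) if is_balanced_autarky(p, F)}
via_closure = {p for p in partial_assignments(F)
               if is_autarky(p, complement_closure(F))}
print(balanced == via_closure)

# A triangle (as positive clause-set) has only the trivial balanced autarky.
triangle = {frozenset({1, 2}), frozenset({2, 3}), frozenset({1, 3})}
print([set(p) for p in partial_assignments(triangle)
       if is_balanced_autarky(p, triangle)])
```

The second computation illustrates balanced leanness for a minimally non-2-colourable hypergraph: only the empty partial assignment qualifies.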


Balanced linear autarkies. The special case of “balanced linear autarkies” was
introduced in [61, Section 6]; these are the simple linear autarkies for F ∪ {−C :
C ∈ F} (recall Subsection [?]).⁶⁾ Equivalently, the balanced linear autarkies φ for
F ∈ CLS are obtained from solutions x ∈ ℝ^{n(F)} of M(F) · x = 0, by translating
the values x_i as discussed before (it is an easy exercise to see that this yields indeed
balanced autarkies). We have a non-trivial balanced linear autarky iff M(F) · x = 0
has a non-trivial solution, and so, in other words, F is balanced linearly lean iff
the columns of M(F) are linearly independent (iff rank(M(F)) = n(F)). Thus if
F ∈ CLS is balanced linearly lean, then δ(F) ≥ 0 holds; furthermore, as shown in
[64, Lemma 7.2], there is then a matching in the clause-variable graph covering all
variable nodes, and thus even δ*(F) = δ(F) holds. By noting that F ∈ CLS is
balanced linearly lean iff F ∪ {−C : C ∈ F} is linearly lean, and considering PCLS,
we obtain that for lean hypergraphs G (especially, minimally non-2-colourable) we
have δ_H(G) ≥ 0. To say the argument again explicitly: Consider a hypergraph
G ∈ PCLS; then G (as a clause-set) is balanced linearly lean iff the variable-clause
matrix has linearly independent rows, iff F_2(G) is linearly lean (again, as a
clause-set), which is implied by F_2(G) being minimally unsatisfiable (or weaker, being
lean), which in turn is equivalent to G (as a hypergraph) being minimally
non-2-colourable. This conclusion “The rows of the incidence matrix [our variable-clause
matrix] of a minimally-non-2-colourable hypergraph are linearly independent over
ℝ.” is shown in [?]; see [?, Lemma 4.7] for this and related results, while the
conclusion “δ_H(G) ≥ 0” is discussed as Principle 2.1 in [?]. For properties of
minimally balanced linearly unsatisfiable clause-sets see [56, Section 4].

6) More precisely one should speak of “balanced simple linear autarkies”, but for convenience
“simple” is dropped. We note that “balanced linear autarkies” are balanced and linear autarkies,
but in general a balanced and linear autarky need not be a balanced linear autarky, and one should
speak of “balanced-linear autarkies”; again this is an abuse of language, motivated by the fact
that linear autarkies which are also balanced are apparently too general a concept to be useful.


Fundamental inequalities. We have so far seen two fundamental inequalities,
namely δ(F) ≥ 1 for F ∈ MLEAN, as first shown in [?] for minimally unsatisfiable
clause-sets, and δ(F) ≥ 0 for balanced linearly lean clause-sets, first shown in [?]
(as δ_H(G) ≥ 0 for minimally non-2-colourable hypergraphs).⁷⁾ Autarky theory shows
the general structure of the arguments: We find “obstructions”, which prevent these
bounds from holding, where such obstructions are given by a subset F′ ⊆ F where
there is a partial assignment φ with φ * F′ = ⊤, while var(φ) ∩ var(F \ F′) = ∅.
Now minimally unsatisfiable F do not have such F′, and thus the envisaged bound
holds for them, and this is the argumentation in e.g. [?].

But one can go beyond this, exploiting autarky reduction. Note that φ is
precisely an autarky, and furthermore possibly one of a special structure. If we just
look at general autarkies, then we obtain the first generalisation, to lean clause-sets
or balanced lean clause-sets (covering the hypergraph cases). However often, due
to the special structure, these special autarkies can be found in polynomial time,
and their application yields some F′ ⊆ F such that the bound holds for F′ (while
for F ∈ MU we just have F′ = F). If we have even an “autarky system”, then F′
is uniquely determined, that is, does not rely on the choice of the autarkies in the
reduction process. The case of main importance for this report is δ(F) ≥ 1, where
the autarkies are matching autarkies, and the reduced F′ is the matching-lean
kernel of F, while those F with F′ = F are precisely the F ∈ MLEAN. On the
other hand, for hypergraph colouring the fundamental fact is δ(F) ≥ 0 for balanced
linearly lean clause-sets, where the autarkies are balanced linear autarkies, and the
reduced F′ is the balanced-linearly-lean kernel of F. In fact, via autarky reduction
we obtain a general method to study decompositions, which we will discuss in the
context of “QMA”.

7) An application yielding Fisher’s inequality (design theory) is discussed in Subsection 7.4 of
[?] (while Seymour’s inequality is discussed there in Subsection 8.2).


1.6.4 Qualitative matrix analysis (QMA)


QMA can be understood as the analysis of matrices M over the real numbers in
abstraction of the absolute values of the entries, where only their signs count; that is,
one considers the qualitative class Q(M), which consists of all matrices with the
same dimensions as M which have entry-wise the same signs as M (positive, zero,
negative), and investigates when a property of M holds for all M′ ∈ Q(M). For
example, a matrix M such that all M′ ∈ Q(M) have linearly independent rows is
called an L-matrix. The monograph [?] is an excellent source on QMA until the
1990s, while a more recent overview is given in [?].

Starting from [16], which exploits Farkas’ lemma to understand (un)satisfiability,
the connections to QMA have been first explored in Sections 3 and 5 of [?]; see [43,
Subsection 11.12.1] for a more substantial introduction. It is shown in [51, Remark 5
in Section 5] that L-matrices correspond (nearly) precisely (up to transposition and
handling of zero-rows/columns and repeated rows/columns) to balanced lean
clause-sets, while lean clause-sets correspond (nearly) precisely to so-called L*-matrices
(as investigated in [?]). The square L-matrices are called SNS-matrices;
SNS-matrices are at the heart of the poly-time decision for MNC^2_{δ_H=0} (recall Subsection
1.6.1), and the connections to autarky theory are explored in [?]; see [43, Subsection
11.12.2] for an overview.
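
To make the matrix view concrete on one fixed representative, the criterion rank(M(F)) = n(F) for balanced linearly lean clause-sets (Subsection 1.6.3) can be checked with exact arithmetic. This is a minimal sketch; the encoding of the clause-variable matrix (one row per clause, one column per variable, entries +1, −1, 0 for positive, negative, and no occurrence) is an assumption made here, and the function names are hypothetical. It tests only the single matrix M(F), not its whole qualitative class.

```python
from fractions import Fraction

def clause_variable_matrix(F):
    """Assumed encoding: rows = clauses, columns = variables, entries in {+1, -1, 0}."""
    clauses = sorted(F, key=sorted)
    V = sorted({abs(x) for C in F for x in C})
    return [[(v in C) - (-v in C) for v in V] for C in clauses]

def rank(M):
    """Rank over the rationals by Gaussian elimination."""
    A = [[Fraction(a) for a in row] for row in M]
    r = 0
    for col in range(len(A[0]) if A else 0):
        pivot = next((i for i in range(r, len(A)) if A[i][col] != 0), None)
        if pivot is None:
            continue
        A[r], A[pivot] = A[pivot], A[r]
        for i in range(r + 1, len(A)):
            factor = A[i][col] / A[r][col]
            A[i] = [a - factor * b for a, b in zip(A[i], A[r])]
        r += 1
    return r

def is_balanced_linearly_lean(F):
    n = len({abs(x) for C in F for x in C})
    return rank(clause_variable_matrix(F)) == n

triangle = {frozenset({1, 2}), frozenset({2, 3}), frozenset({1, 3})}
path = {frozenset({1, 2}), frozenset({2, 3})}
print(is_balanced_linearly_lean(triangle))  # minimally non-2-colourable
print(is_balanced_linearly_lean(path))      # 2-colourable, fewer clauses than variables
```

For the path, the kernel vector (1, −1, 1) of M(F) corresponds to the 2-colouring that alternates along the path.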

Further in the translation of terms, now regarding unsatisfiability: unsatisfiable
clause-sets correspond to sign-central matrices, minimally unsatisfiable clause-sets
correspond to minimally sign-central matrices. So [?, Theorem 5.4.3] is yet another
proof of δ(F) ≥ 1 for F ∈ MU. The variable-degree, as studied in the current
report, corresponds to the number of non-zero entries in the rows of the matrices
(while the deficiency is the difference of the number of columns and the number of
rows). The elements of MU_{δ=1} correspond to S-matrices, the elements of SMU_{δ=1}
correspond to maximal S-matrices.

As mentioned, autarky systems A (like balanced autarkies, matching autarkies,
etc.) also yield a framework for decomposition theorems. The basic decomposition
is into A-lean and A-satisfiable sub-clause-sets, as given in [?, Theorem 8.5] (for
normal autarky systems), which corresponds to a certain unique decomposition of
the clause-variable matrix into a triangular shape with two blocks on the diagonal,
and generalises various matrix decompositions in QMA, as discussed in [61,
Footnote 7, Page 246]. A-lean clause-sets themselves can be further decomposed, and the
main result is [?, Lemma 6] (reviewed in [43, Subsection 11.11.5]), generalising [?,
Theorem 2.2.5], stating that a clause-set F ∈ CLS is minimally A-unsatisfiable iff
F is barely A-lean (it is lean, but removal of any single clause destroys this) and
A-indecomposable (no triangular decomposition into A-lean blocks is possible for
the clause-variable matrix).


1.6.5 Biclique partitions of (multi-)graphs, and algebraic graph theory 


We finish this overview on related themes in combinatorics by a field of graph theory, 
which, like QMA, can be understood as a study of clause-sets from a special angle, 
focusing on the conflict-structure of clauses. 


Certain aspects of algebraic graph theory. The starting point is [?], where
the problem of “addressing a graph” was introduced. One considers a symmetric
matrix D of dimension m ∈ ℕ over ℕ_0, with a zero-diagonal, where the entries are
interpreted as “distances” (in [?] the D_{i,j} are the distances between the nodes of
some graph), and asks for the smallest N ∈ ℕ_0 such that there are m codewords
c_1, ..., c_m ∈ {0,1,*}^N with the property that the modified Hamming distance
between c_i and c_j, which simply ignores positions with *, is D_{i,j}.

From our point of view, such a codeword is nothing else than a clause over the
variables 1,...,N, while the modified Hamming distance is the number of clashes
(conflicts). So the question is about the existence of clauses C_i for i ∈ {1,...,m}
over variables 1,...,N such that D_{i,j} = |C_i ∩ −C_j| for i, j ∈ {1,...,m}. However,
so far “clause-sets” are not perceived/known as combinatorial objects, and their
perspective is missing from the literature. See [?, Chapter 9] for an
introduction. A basic result of [?] is that if D has all entries outside the diagonal
equal to 1, then N = m − 1 (see also [?, Lemma 6.6] for a discussion in the
context of eigenvalue methods; for a direct combinatorial proof see [?]). This follows
from the general result N ≥ max(n_+(D), n_−(D)) of [?] (the “Lemma of
Witsenhausen”), where n_+(D) resp. n_−(D) is the number of positive resp. negative eigenvalues
of D. For the general case in [?] it is shown that if the distances D_{i,j} are indeed
the distances between the nodes of some graph, then we have N ≤ m − 1.

Actually, the Lemma of Witsenhausen works for arbitrary matrices D over ℕ_0
with zero diagonal. Taking up the clause-set perspective again, the minimal number
N of variables in a clause-set representing D (it is an easy exercise to see that N is
finite, i.e., a representation is always possible) is equal to the minimal number of
0,1-matrices of rank 1 which add up to D: A variable contributes precisely a “rectangle”,
i.e., a matrix which is 1 at the entries I × J for some ∅ ≠ I, J ⊆ {1,...,m}, and
otherwise 0, and these are precisely the 0,1-matrices of rank 1. Considering D as the
adjacency matrix A = D of some multigraph (where parallel edges are allowed), we
see that N is also equal to the minimum number of bicliques into which the edge-set
of that multigraph can be partitioned, and N is therefore denoted by bcp(A) ∈ ℕ_0
(the “biclique partition number” of A resp. the corresponding multigraph).⁸⁾

The notion “hermitian rank” has been introduced and studied in [?] for
arbitrary hermitian matrices A (square matrices with complex numbers as entries, such
that transposing the matrix and taking the complex conjugate of each entry yields
back the original matrix), denoted by h(A) := max(n_+(A), n_−(A)) ∈ ℕ_0. So the
Lemma of Witsenhausen takes the form that for symmetric matrices A over ℕ_0
with zero diagonal holds bcp(A) ≥ h(A).


Conflict analysis. The essential observation is now that we can go back and forth 
between biclique partitions of multigraphs and clause-sets. In one direction we can 
understand clause-sets F as representations of biclique partitions of multigraphs,
where for each vertex we get a clause, and from each biclique we obtain a variable, 
where the two sides of the biclique are the positive and negative occurrences of the 
variable. So we can understand a multigraph together with a biclique partition as a 
clause-set, and we can use tools from clause-set-logic to analyse the pair multigraph 
with biclique-partition. The deficiency then becomes the difference between the 
number of nodes and the number of bicliques. Satisfiability means that it is possible 
to select from each biclique one side such that all vertices are covered. 

In the other direction we can understand biclique partitions of multigraphs (or,
equivalently, representing a matrix A as above as a sum of rank-1 matrices over
{0,1}) as representations of clause-sets F, namely the nodes of the conflict
multigraph cmg(F) are given by the clauses, while the edges are the conflicts (clashing
literal occurrences x, −x), and the bicliques are given by the variables (their positive
and negative occurrences). In this way we can analyse the influence of the “conflict
structure” on properties of clause-sets; the basic notions, as introduced in [54] with
underlying report [?], are as follows.

8) In [?] we used “ns(A)” instead, the “symmetric conflict number”.


For F ∈ CLS let CM(F) (the conflict matrix) be the square matrix of dimension
c(F) over ℕ_0, with entries |C ∩ −D| for C, D ∈ F (thus with zero diagonal), i.e.,
CM(F) is the adjacency matrix of cmg(F). So we can use the hermitian rank as a
measure h : CLS → ℕ_0 (as first done in [?, Subsection 3.2]), namely

h(F) := h(CM(F));

see Points 1, 3 in [?, Section 2] for various equivalent characterisations.⁹⁾ By
definition we have bcp(F) := bcp(CM(F)) ≤ n(F), and thus h(F) ≤ n(F). Since
for a principal submatrix A′ of a hermitian matrix A holds h(A′) ≤ h(A) (this
follows by “interlacing”; see [?, Theorem 9.1.1]), we get h(φ * F) ≤ h(F) for all
partial assignments φ, and also h(F′) ≤ h(F) for all F′ ⊆ F, which gives motivation
to consider h(F) as a complexity measure for F ∈ CLS.
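
The conflict matrix and the hermitian rank can be computed exactly for small clause-sets. The following is a minimal sketch, not the method of the cited works: it obtains the characteristic polynomial by the Faddeev-LeVerrier recursion and counts positive eigenvalues via Descartes' rule of signs, which is exact here because a real symmetric matrix has only real eigenvalues. The clause representation and all function names are assumptions for this example.

```python
from fractions import Fraction

def conflict_matrix(F):
    """CM(F): the entry for clauses C, D is the number of clashes |C n -D|."""
    clauses = sorted(F, key=sorted)
    return [[len(C & {-x for x in D}) for D in clauses] for C in clauses]

def char_poly(A):
    """Coefficients c[0..n] with det(x*I - A) = sum of c[k] * x^k."""
    n = len(A)
    c = [Fraction(0)] * n + [Fraction(1)]
    M = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        # M_k = A * M_{k-1} + c[n-k+1] * I
        M = [[sum(A[i][l] * M[l][j] for l in range(n))
              + (c[n - k + 1] if i == j else 0)
              for j in range(n)] for i in range(n)]
        trace = sum(A[i][l] * M[l][i] for i in range(n) for l in range(n))
        c[n - k] = -trace / k
    return c

def hermitian_rank(A):
    """h(A) = max(n_+, n_-) for a real symmetric matrix A."""
    n = len(A)
    c = char_poly(A)
    zero = next(k for k in range(n + 1) if c[k] != 0)  # multiplicity of eigenvalue 0
    signs = [x > 0 for x in c if x != 0]
    pos = sum(s != t for s, t in zip(signs, signs[1:]))  # sign changes
    return max(pos, n - zero - pos)

# A hitting clause-set in which every two clauses clash exactly once.
F = {frozenset({1}), frozenset({-1, 2}), frozenset({-1, -2})}
CM = conflict_matrix(F)
n = len({abs(x) for C in F for x in C})
print(CM, hermitian_rank(CM), n)
```

In this example h(F) coincides with n(F), in line with the bound h(F) ≤ n(F) and with the case of all off-diagonal entries equal to 1 mentioned above (m = 3 clauses, N = m − 1 = 2 variables).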

In [54] also the hermitian defect δ_h : CLS → ℕ_0 has been introduced as

δ_h(F) := c(F) − h(F),

and thus δ(F) ≤ δ_h(F); see Point 2 in [?, Section 2] for a geometric characterisation
(as the “Witt index” of the quadratic form associated with CM(F)). Actually
δ*(F) ≤ δ_h(F) holds and even stronger properties (see [?, Subsection 3.3]). An
important property is (again) δ_h(φ * F) ≤ δ_h(F) for all F ∈ CLS and partial
assignments φ together with δ_h(F′) ≤ δ_h(F) for F′ ⊆ F, by [64, Corollary 9], and so
we might consider the hermitian defect as a stabilised version of the maximal defect
(both are also complexity measures; recall that we have fixed-parameter tractable
SAT decision for input F ∈ CLS in the parameter δ*(F)). Note that in general
we can have δ*(φ * F) > δ*(F), for example F := {{1}} has δ*(F) = δ(F) = 0,
while for φ := ⟨1 → 0⟩ we get φ * F = {⊥}, and thus δ*(φ * F) = δ(φ * F) = 1. See
[?, Subsection 3.3] and [57, Subsection 11.2] for more information on δ*(φ * F);
splitting on a single(!) variable is very important for this report, with the basic fact
δ*(⟨x → 1⟩ * F) ≤ δ(F) for F ∈ MU and any literal x.
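
The example with F = {{1}} and the splitting fact for minimally unsatisfiable clause-sets can be replayed by brute force. A minimal sketch, assuming a partial assignment is given as the set of literals it makes true and that δ*(F) is the maximum of δ(F′) over all F′ ⊆ F; function names are hypothetical.

```python
from itertools import combinations

def apply(phi, F):
    """phi * F: drop satisfied clauses, remove falsified literals."""
    return {frozenset(x for x in C if -x not in phi)
            for C in F if not (C & phi)}

def deficiency(F):
    return len(F) - len({abs(x) for C in F for x in C})

def max_deficiency(F):
    """delta*(F): maximum of delta(F') over all sub-clause-sets F' (brute force)."""
    clauses = list(F)
    return max(deficiency(set(S))
               for r in range(len(clauses) + 1)
               for S in combinations(clauses, r))

F = {frozenset({1})}
phi = frozenset({-1})                 # the assignment <1 -> 0>
print(max_deficiency(F), apply(phi, F), max_deficiency(apply(phi, F)))

# Splitting a minimally unsatisfiable clause-set on a single literal.
MU_example = {frozenset({1}), frozenset({-1, 2}), frozenset({-1, -2})}
for x in (1, -1, 2, -2):
    print(x, max_deficiency(apply({x}, MU_example)) <= deficiency(MU_example))
```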

The first direct applications used δ(F) ≤ δ_h(F) for F ∈ CLS, namely that
for a hitting clause-set F ∈ HIT (equivalently, all entries of CM(F) outside the
diagonal are non-zero) with a regular conflict multigraph (i.e., all entries of CM(F)
outside the diagonal are equal) we have δ_h(F) ≤ 1, and thus δ(F) ≤ 1 ([64, Theorem
33]).¹⁰⁾ We get that SMU_{δ=1} = UHIT_{δ=1} is (precisely) the class of unsatisfiable
hitting clause-sets with regular conflict multigraph ([64, Corollary 34]; a
combinatorial proof of this was independently found in [?, Lemma 11]), and is also (precisely)
the class of unsatisfiable clause-sets F with δ_h(F) ≤ 1 ([54, Theorem 26]).

A clause-set F ∈ CLS is called exact ([?, Subsection 3.4]) if bcp(F) = n(F),
that is, F is optimal in realising cmg(F) with respect to the number of variables.
Deciding exactness is coNP-complete, while the special class of eigensharp clause-sets,
defined by h(F) = n(F), or, equivalently, δ_h(F) = δ(F), is decidable in polynomial
time. By [?, Theorem 14] every eigensharp clause-set is matching lean. This
leads to [64, Conjecture 16], “Every exact clause-set, whose conflict-matrix is the
distance matrix of some connected graph, is matching lean.”, which generalises the
already mentioned main result of [?] (the proof of the “squashed cube conjecture”).

As already mentioned, we consider h(F) for F ∈ CLS as some form of complexity
measure, measuring the complexity of representing the conflicts of F via simple
matrices. In [?] polytime SAT decision in case h(F) ≤ 1 was shown, while the
cases h(F) ≤ k for fixed k ≥ 2 are open; an interesting stepping stone is to show
polytime SAT decision for F ∈ CLS with bcp(F) ≤ k (recall CLS_{bcp≤k} ⊆ CLS_{h≤k}).
The notion of blocked clauses, a special type of clauses which can be removed
sat-equivalently, introduced in [?], is important here, and [?, Theorem 3] shows
that from F ∈ CLS_{h≤1} after elimination of all blocked clauses (which yields a
unique result) we obtain F′ ⊆ F with bcp(F′) ≤ 1. We recall from Subsection
1.6.2 that SAT-decision for F′ is now a special case of the Transversal Hypergraph
Problem, namely, as shown in [?, Lemma 11], the problem is exactly the Exact
Transversal Hypergraph Problem, where every transversal must be “exact”, that is,
must intersect every hyperedge in exactly one vertex; this problem is decidable in
polynomial time by [?], and thus we get SAT-decision for CLS_{h≤1} in polynomial
time. The characterisation of F ∈ MU with bcp(F) ≤ 1 is an open problem (while
we have polytime membership decision for MU_{bcp≤1}), and by [64, Conjecture 16]
they would have a very simple structure.

9) For Point 1(c) there it must be a “diagonal matrix A′”.
10) In [?] unfortunately the term “uniform” was (mis)used instead of “regular”.

We conclude by mentioning that in [?, Section 6] the basic facts are
generalised to non-boolean clause-sets, and that by extending the reduction of
multiclique partitions to biclique partitions from [?] a new and interesting translation
from non-boolean to boolean clause-sets was obtained.


1.7 Overview on results 


Sections 2 to 6 provide foundations for the main results in the later sections. In
Section 2 basic notions and concepts regarding clause-sets and autarkies are reviewed.
In Section 3 we discuss minimal unsatisfiability, with some auxiliary results on
saturation (adding literal occurrences to clauses, to make minimal unsatisfiability robust
against splitting) and splitting. Section 4 reviews the concept of “variable-minimal
unsatisfiability”, as introduced in [?], i.e., the class MU ⊂ VMU ⊂ USAT. There
are mistakes in that paper, and we rectify them here:


• we show that VMU ⊂ LEAN holds (Lemma 4.3);
• we provide a corrected characterisation of VMU (Lemma 4.5);


• and we give a corrected proof of polytime decision of VMU_{δ=k} for fixed k
in Theorem 4.7, where we also obtain the stronger result that decision of
VMU (i.e., deciding whether for input F ∈ CLS holds F ∈ VMU or not) is
fixed-parameter tractable in the deficiency δ(F).


Section 5 is then concerned with singular variables, eliminating them via singular
DP-reduction, and creating them via “singular extensions”. An important auxiliary
result is Lemma [?], showing that eliminating singular variables is harmless for
bounds on the minimum variable-degree; we also show various auxiliary results
on unit-clauses in minimally unsatisfiable clause-sets. This block of preparatory
sections is concluded by Section 6 on full subsumption resolution, a ubiquitous
reduction (and extension); as an application, in Theorem [?] we can determine
precisely the possible n(F) and c(F) for F ∈ MU_{δ=k}.

The first main results (but still on the preparation side) are found in Section
7, which introduces the numbers nM(k) ∈ ℕ and proves exact formulas and sharp
lower and upper bounds; the point here is that the introduction of nM(k) happens
via a recursion which is tailor-made for our application in Section 8, but which
makes it somewhat difficult to determine the numbers in a global way. A main
technical result is Theorem 7.15, while Theorem [?] proves the general formula.
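
For orientation, the numbers nM(k) can be listed directly from their description in the abstract as the k-th natural number not of the form 2^n − 1. This is a minimal sketch of that description only; it does not reproduce the recursion used in the report, and the function names are hypothetical. The final check uses the upper bound nM(k) ≤ k + 1 + log_2(k) quoted in the abstract.

```python
from math import log2

def is_mersenne_form(m):
    """True iff m = 2^n - 1 for some n >= 0, i.e. m + 1 is a power of two."""
    return ((m + 1) & m) == 0

def nM(k):
    """k-th natural number (k >= 1) that is not of the form 2^n - 1."""
    m, count = 0, 0
    while count < k:
        m += 1
        if not is_mersenne_form(m):
            count += 1
    return m

print([nM(k) for k in range(1, 13)])
print(all(nM(k) <= k + 1 + log2(k) for k in range(1, 200)))
```

The skipped values in the printed list are exactly 1, 3, 7 and 15.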

In Section 8 we then find a basic central result of this report, the upper bound
μvd(MU_{δ=k}) ≤ nM(k) (Theorem 8.3). Section 9 is concerned with generalising
this upper bound. An interesting auxiliary class SED ⊂ CLS, clause-sets where
deficiency and surplus coincide, is introduced in Subsection 9.1; the main lemma
here is Lemma 9.5, which shows that unsatisfiable elements of SED are in fact in
VMU. In Subsection [?] the upper bound for MU then is lifted to lean clause-sets
in Theorem 9.8, and also sharpened via replacing the deficiency δ by the surplus σ.
Theorem [?] shows that the upper bound is sharp for any class between VMU ∩
SED and LEAN.

Section 10 concerns algorithmic applications. A corollary of Theorem [?] is
that if the asserted upper bound on the minimum variable degree is not fulfilled,
then a non-trivial autarky must exist (Lemma 10.1). Since the variable-set of such
a non-trivial autarky is polytime computable, we show in Theorem [?] that we can
indeed establish the upper bound shown for lean clause-sets also for general
clause-sets, after a polytime autarky-reduction. In Subsection [?] then the problem of
finding such an autarky (that is, finding the assignment) is discussed, with Conjecture
10.4 making precise our belief that one can find such autarkies efficiently. Theorem
10.9 pinpoints the “critical” class MLCR ⊂ SAT, which is polytime decidable,
and where we know that these clause-sets are satisfiable, but we do not even know
how to find any non-trivial autarky efficiently. This block on generalisations of
the min-var-degree upper bound is concluded by Section 11, where we discuss the
possibilities to generalise it to matching-lean clause-sets (where only the absence of
special (non-trivial) autarkies is guaranteed).

In Section 12 we then turn to the study of the numbers μnM(k) := μvd(MU_{δ=k}),
looking now for improved upper bounds and matching lower bounds. We present
two infinite classes of deficiencies k with μnM(k) = nM(k), and present a general
method of obtaining lower bounds for μnM(k), via counting full clauses (clauses
containing all variables; these clauses are strong structural anchors). In Section 13
we introduce a general recursive method to obtain upper bounds like nM(k), via the
“non-Mersenne operator” NM(f), which takes a “valid bounds function” f, that is,
some partial information on μnM(k), and improves it (Definition 13.12). Theorem
shows that this indeed yields a valid method for improving upper bounds on
μnM(k), while in Theorem we demonstrate how this method recovers nM(k),
by just starting with the information μnM(1) = 2. In Section 14 we harvest (first)
fruits of these methods. First in Theorem 14.1 we show μnM(k) = nM(k) for k ≤ 5.
Then in Theorem we prove μnM(6) = nM(6) − 1 (using a variety of structural
results on MU provided in this report). Plugging this information on μnM into
our machinery, we obtain the improved upper bound μnM ≤ nM_1 in Theorem 14.5,
while in Theorem we determine nM_1(k) numerically.

Finally, in Section 15 open problems are stated, thoroughly discussing research
perspectives, including nine conjectures. Subsection discusses improved up-
per bounds for μnM(k) from the forthcoming work [68]. Subsection is about
improved lower bounds, via counting full clauses. In Lemma we cite from
the work in progress [64] (to be completed soon), which provides improved lower
bounds via the “Smarandache primitive function” S_2(k), yielding the first-order
asymptotic determination μnM(k) ∼ k (Corollary), where now the open
question is about the asymptotic determination of μnM(k) − k. In Subsection 15.4
we discuss generalisations to non-boolean clause-sets.

The central Conjecture 15.4 of the project of “understanding MU”, on the finitely
many “characteristic patterns” for each MU_{δ=k}, is discussed in Subsection 15.5.
An important special case is Conjecture (now a fully precise statement), about
the classification of unsatisfiable hitting clause-sets (or “disjoint/orthogonal tau-
tologies” in the terminology of DNFs). In Lemma we show how two of the
conjectures together yield computability of μnM(k).

This report is a substantial extension of the conference paper [62]: Section 3
there has been extended to Section 7 here, with considerably more details and
examples on non-Mersenne numbers. Section 4 there is covered by Sections 8, 9
and with various additional results (for example showing sharpness of the upper
bound for LEAN). And the results of Section 5 there are contained in Subsection
here. All other sections in this report are new.


2 Preliminaries

We follow the general notations and definitions as outlined in [43], where also further
background on autarkies and minimal unsatisfiability can be found. We use ℕ =
{1, 2, ...} and ℕ₀ = ℕ ∪ {0}. For the binary logarithm we use ld(x) := log₂(x)
(“logarithmus dualis”).

We apply standard set-theoretic concepts, like that of a map as a set of pairs,
and standard set-theoretic notations, like f(S) = {f(x) : x ∈ S} for maps f and
S ⊆ dom(f), and “⊂” for the strict subset-relation. We also use the less-known
notation “A ⊍ B” for union in case A, B are disjoint, that is, A ⊍ B := A ∪ B is only
defined for A ∩ B = ∅. For maps f, g with the same domain X we use f ≤ g :⇔ ∀x ∈
X : f(x) ≤ g(x) (i.e., pointwise comparison), while f < g :⇔ ∀x ∈ X : f(x) < g(x).


2.1 Clause-sets 


The basic structure is a set LIT, the elements called “literals”, together with a
fixed-point free involution called “complementation”, written x ∈ LIT ↦ x̄ ∈ LIT;
so the laws are x̄ ≠ x and x̿ = x for x ∈ LIT. We assume ℤ \ {0} ⊆ LIT, with
x̄ = −x for x ∈ ℤ \ {0}. For a set L of literals we define L̄ := {x̄ : x ∈ L}.
Furthermore a set ℕ ⊆ VA ⊂ LIT, the elements called “variables”, is given, with
LIT = VA ⊍ {v̄ : v ∈ VA}. Variables are also called “positive literals”, while literals v̄ for
v ∈ VA are called “negative literals”. The “underlying variable” of a literal is given
by the operation var : LIT → VA (“forgetting complementation”), with var(v) := v
and var(v̄) := v for v ∈ VA.


Example 2.1 We can thus write e.g. 1, 6 for two (different) variables, and 1, 5, −1
for three (different) literals. In examples we will also use v, w, a, b, c and such letters
for variables (as is customary), and accordingly v̄ etc. for literals, and in this
context (only) it is then understood that these variables are pairwise different. So
{v, w, x, x̄}, when given in an example (without further specification), denotes a set
of literals with |{v, w, x, x̄}| = 4 and |{v, w, x, x̄} ∩ VA| = 3.

Without restriction we could assume LIT = ℤ \ {0} (as we did in the Introduc-
tion), but it is often convenient to use arbitrary mathematical objects as variables.
All our objects built from literals are finite, and thus, because of the infinite supply
of variables, there will always be “new variables” (that is the mathematical point of
having natural numbers as variables; we won't use the arithmetical structure).


A clause C is a finite and clash-free set of literals (i.e., C ∩ C̄ = ∅); the set of all
clauses is CL. A clause-set is a finite set of clauses; the set of all clause-sets is CLS.
The simplest clause is the empty clause ⊥ := ∅ ∈ CL, the simplest clause-set is the
empty clause-set ⊤ := ∅ ∈ CLS. The set of all hitting clause-sets is denoted by
HIT ⊂ CLS, those F ∈ CLS such that two different clauses C, D ∈ F, C ≠ D,
have at least one clash, i.e., C ∩ D̄ ≠ ∅. In the language of DNF, hitting clause-sets
are known as “orthogonal” or “disjoint” DNFs; see [13, Chapter 7].


Example 2.2 We have e.g. {1, 2, −3} ∈ CL, while {−1, 1} ∉ CL. The only clause-
set in HIT containing the empty clause is {⊥} ∈ HIT. An example of a non-hitting
clause-set is {{1, 2}, {−1, 2}, {3}} ∈ CLS \ HIT, where we obtain an element of
HIT if we add literal −2 to the third clause.


We use var(F) := ⋃_{C ∈ F} var(C) for the set of variables of F ∈ CLS, where
var(C) := {var(x) : x ∈ C} is the set of variables of clause C ∈ CL. The possible
literals for a clause-set F are given by lit(F) := var(F) ∪ {v̄ : v ∈ var(F)}, while the actually
occurring literals are just given by ⋃F (the union of the clauses of F). A literal x
is pure for F if x̄ ∉ ⋃F. For a clause-set F we use the following measurements:

• n(F) := |var(F)| ∈ ℕ₀ is the number of variables,

• c(F) := |F| ∈ ℕ₀ is the number of clauses,

• δ(F) := c(F) − n(F) ∈ ℤ is the deficiency (the difference of the number of
  clauses and the number of variables),

• ℓ(F) := Σ_{C ∈ F} |C| ∈ ℕ₀ is the number of literal occurrences.


We call a clause C full for a clause-set F if C ∈ F and var(C) = var(F), while
a clause-set F is called full if every clause is full. For a finite set V of variables let

A(V) := {C ∈ CL : var(C) = V} ∈ CLS.

Obviously A(V) ∈ HIT is the set of all 2^{|V|} full clauses over V, and F ∈ CLS is
full iff F ⊆ A(var(F)). We use A_n := A({1, ..., n}) for n ∈ ℕ₀. Dually, a variable
v ∈ VA is called full for a clause-set F if for all C ∈ F holds v ∈ var(C). A
clause-set is full iff every v ∈ var(F) is full.


Example 2.3 For F := {⊥, {1}, {−1, 2}} we have:
1. var(F) = {1, 2}, lit(F) = {−1, 1, −2, 2}, ⋃F = {−1, 1, 2}.
2. Literal 2 is pure for F (the other literals in lit(F) are not pure).
3. n(F) = 2, c(F) = 3, δ(F) = 1, ℓ(F) = 3.
4. {−1, 2} is a full clause of F, while the two other clauses are not full.
5. F has no full variable, while F \ {⊥} has the (single) full variable 1.
The standard “complete” full clause-sets are A_0 = {⊥}, A_1 = {{−1}, {1}},

A_2 = {{−1, −2}, {−1, 2}, {1, −2}, {1, 2}},

and so on.
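The basic measures can be sketched in a few lines of Python. We assume here (as the text permits) that literals are non-zero integers with complementation given by negation, and we represent a clause-set as a frozenset of frozensets; all function names are our own illustrative choices.

```python
from itertools import product


def clause_set(*clauses):
    """Build a clause-set from iterables of non-zero integers (literals)."""
    return frozenset(frozenset(cl) for cl in clauses)


def variables(F):
    """var(F): the underlying variables of all occurring literals."""
    return {abs(x) for C in F for x in C}


def n(F):
    return len(variables(F))


def c(F):
    return len(F)


def deficiency(F):
    return c(F) - n(F)


def literal_occurrences(F):
    return sum(len(C) for C in F)


def is_pure(x, F):
    """Literal x is pure for F if its complement does not occur in F."""
    return all(-x not in C for C in F)


def full_clauses(F):
    """The clauses of F that contain every variable of F."""
    V = variables(F)
    return {C for C in F if {abs(x) for x in C} == V}


def A(V):
    """All 2^|V| full clauses over the finite variable set V."""
    V = sorted(V)
    return frozenset(
        frozenset(sign * v for sign, v in zip(signs, V))
        for signs in product((1, -1), repeat=len(V))
    )
```

With `F = clause_set([], [1], [-1, 2])` these functions reproduce the values listed in Example 2.3, and `A([])` yields the clause-set containing only the empty clause.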


We often define a class of clause-sets via some measure μ as follows:


Definition 2.4 Consider a class C ⊆ CLS and a measure μ : CLS → ℝ. For a ∈ ℝ
we use C_{μ=a} := {F ∈ C : μ(F) = a}, and similarly we use C_{μ≤a} and analogous
notations.


When we use the form “C_{μ=a}”, then μ stands for a measure (e.g., μ = δ or μ = n).


Example 2.5 CLS_{n=0} = CLS_{ℓ=0} = {⊤, {⊥}}, CLS_{c=0} = {⊤}, and CLS_{n<0} = ∅.


2.2 Semantics 


A partial assignment is a map φ : V → {0, 1} for some finite (possibly empty)
set V ⊂ VA of variables, where var(φ) := V and lit(φ) := var(φ) ∪ {v̄ : v ∈ var(φ)}. The set
of all partial assignments is denoted by PASS. For a literal x ∈ lit(φ) we also define
φ(x) ∈ {0, 1}, via φ(v̄) := 1 − φ(v) for v ∈ var(φ). Via a small abuse of language
we define φ⁻¹(ε) := {x ∈ lit(φ) : φ(x) = ε} ∈ CL for ε ∈ {0, 1}. Special partial
assignments are the empty partial assignment ⟨⟩ := ∅, and for literals x ∈ LIT and
ε ∈ {0, 1} the partial assignment ⟨x → ε⟩ ∈ PASS, with var(⟨x → ε⟩) = {var(x)}
and ⟨x → ε⟩(x) = ε.

The application of a partial assignment φ ∈ PASS to a clause-set F is denoted
by φ * F, which yields the clause-set obtained from F by removing all satisfied
clauses (which have at least one literal set to 1), and removing all falsified literals
from the remaining clauses:

φ * F := {C \ φ⁻¹(0) : C ∈ F ∧ C ∩ φ⁻¹(1) = ∅} ∈ CLS.

This definition is motivated by the default interpretation of a clause-set as a “con-
junctive normal form” (CNF), where a clause is understood as a disjunction of
literals (thus is satisfied iff at least one literal in it is satisfied), while a clause-
set is understood as a conjunction of its clauses (thus is satisfied iff all clauses
are satisfied). A clause-set F is satisfiable iff there is a partial assignment φ with
φ * F = ⊤, otherwise F is unsatisfiable. The set of satisfiable clause-sets is denoted
by SAT ⊂ CLS, while USAT := CLS \ SAT denotes the set of all unsatisfiable
clause-sets.
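As an illustration of these definitions, the following sketch applies a partial assignment to a clause-set and decides satisfiability by exhaustive search. Literals are assumed to be non-zero integers (complement = negation), partial assignments are dictionaries from variables to 0/1, and the names `apply` and `is_satisfiable` are hypothetical; the exhaustive search is only meant for very small examples.

```python
from itertools import product


def apply(phi, F):
    """Compute phi * F: drop satisfied clauses, remove falsified literals.

    phi maps variables (positive integers) to 0 or 1;
    F is an iterable of frozensets of non-zero integers.
    """
    def value(x):
        v = phi.get(abs(x))
        if v is None:
            return None                 # literal not touched by phi
        return v if x > 0 else 1 - v

    result = set()
    for C in F:
        if any(value(x) == 1 for x in C):
            continue                    # clause satisfied, removed
        result.add(frozenset(x for x in C if value(x) is None))
    return frozenset(result)


def is_satisfiable(F):
    """Brute force: try all total assignments over var(F)."""
    V = sorted({abs(x) for C in F for x in C})
    return any(
        not apply(dict(zip(V, bits)), F)   # empty result means phi * F is the empty clause-set
        for bits in product((0, 1), repeat=len(V))
    )
```

Note that the empty clause-set is reported as satisfiable (by the empty assignment), while any clause-set containing the empty clause is reported as unsatisfiable, in agreement with the definitions above.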


Example 2.6 If F ∈ USAT and for F′ ∈ CLS holds F ⊆ F′, then also F′ ∈
USAT (satisfying a clause-set gets harder the more clauses there are).

By definition we have φ * F = ⊤ iff ∀D ∈ F : φ⁻¹(1) ∩ D ≠ ∅; thus F ∈ SAT
iff there is a clause C ∈ CL with C ∩ D ≠ ∅ for all D ∈ F. (We could write
“C ∩ D ≠ ⊥” here, but it appears somewhat more natural to use “∅” here.)


The unsatisfiable hitting clause-sets are denoted by UHIT := USAT ∩ HIT.


Example 2.7 ⊤ ∈ SAT ∩ HIT and {⊥} ∈ UHIT. In general a full clause-set F
is unsatisfiable iff F = A(var(F)), and thus A(V) ∈ UHIT for all finite V ⊂ VA.

The fundamental property for F ∈ HIT: Consider φ, ψ ∈ PASS, such that there
are C, D ∈ F, C ≠ D, with φ * {C} = ψ * {D} = {⊥} (that is, ⊥ ∈ (φ * F) ∩ (ψ * F),
where there are different falsified clauses for these two partial assignments). Then
φ, ψ are incompatible, i.e., there is v ∈ var(φ) ∩ var(ψ) with φ(v) ≠ ψ(v).

It follows easily that for F ∈ HIT holds F ∈ USAT ⇔ Σ_{C ∈ F} 2^{−|C|} = 1.

A nice exercise is to show UHIT_{δ≤0} = ∅ (in Section a more general result
is stated).


Finally, the semantical implication F ⊨ C for F ∈ CLS and clauses C ∈ CL
holds iff ∀φ ∈ PASS : φ * F = ⊤ ⇒ φ * {C} = ⊤. We have F ∈ USAT ⇔ F ⊨ ⊥.


2.3 Resolution


Two clauses C, D ∈ CL are resolvable if |C ∩ D̄| = 1, i.e., they clash in exactly one
variable (called the resolution variable var(x), while x is called the resolution literal).
For two resolvable clauses C and D, the resolvent C ⋄ D := (C ∪ D) \ {x, x̄} ∈ CL
for C ∩ D̄ = {x} is the union of the two clauses minus the resolution literal and its
complement. As is well-known (the earliest source is [7]), a clause-set F is
unsatisfiable iff via resolution (i.e., closing F under addition of resolvents) we can
derive ⊥, and, more generally, we have F ⊨ C iff from F via resolution a clause
C′ ⊆ C is derivable.

An important reduction for clause-sets F ∈ CLS and variables v ∈ VA, resulting
in a clause-set satisfiability-equivalent to F (satisfiable iff F is; sometimes called
“equi-satisfiable”) and with variable v eliminated, is DP-reduction

DP_v(F) := {C ∈ F : v ∉ var(C)} ∪ {C ⋄ D : C, D ∈ F ∧ C ∩ D̄ = {v}} ∈ CLS

(also called “variable elimination”), obtained from F by removing all clauses con-
taining variable v from F, and replacing them by their resolvents on v. See for a
fundamental study of DP-reduction. The satisfying assignments φ of DP_v(F) (i.e.,
φ * DP_v(F) = ⊤) with var(φ) = var(F) \ {v} are precisely the satisfying assignments
φ of F with var(φ) = var(F), when restricting φ to var(F) \ {v}. Logically, DP_v(F)
is equivalent to ∃v : F, the existential quantification of v for F (but we do not use
quantifiers in this report, so this remark might be ignored here).
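DP-reduction as defined above can be sketched directly. We again assume integer literals with complement given by negation; the function name `dp` is our own, and the sketch makes no attempt at efficiency.

```python
def dp(F, v):
    """DP-reduction DP_v(F): eliminate variable v (a positive integer) from F.

    Clauses without v are kept; every pair (C, D) with v in C, -v in D and
    no further clash contributes the resolvent (C | D) - {v, -v}.
    """
    keep = {C for C in F if v not in C and -v not in C}
    pos = [C for C in F if v in C]
    neg = [D for D in F if -v in D]
    resolvents = set()
    for C in pos:
        for D in neg:
            clashes = {x for x in C if -x in D}
            if clashes == {v}:          # resolvable exactly on v
                resolvents.add(frozenset((C | D) - {v, -v}))
    return frozenset(keep | resolvents)
```

For example, eliminating first variable 1 and then variable 2 from the full clause-set over {1, 2} leaves only the empty clause, reflecting that this clause-set is unsatisfiable.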
2.4 Multi-clause-sets and restrictions


These notions are generalised to multi-clause-sets, which are maps F : CL → ℕ₀,
such that the underlying set of clauses {C ∈ CL : F(C) ≠ 0} is finite, and so we
speak of the underlying clause-set; the set of all multi-clause-sets is denoted by
CLS := {F ∈ ℕ₀^{CL} : {C ∈ CL : F(C) ≠ 0} is finite}.^{11)} Clause-sets are implicitly promoted
to multi-clause-sets, if needed, by using their characteristic functions, and multi-
clause-sets are implicitly cast down, if needed, to clause-sets by considering the
underlying clause-set; “if needed” refers to operations which either require multi-
clause-sets or clause-sets. If however we want to make explicit these operations, we
use cls : CLS → CLS from multi-clause-sets to clause-sets (with cls(F) := CL \ F⁻¹(0))
and cls : CLS → CLS from clause-sets to multi-clause-sets (with
cls(F)(C) := 1 if C ∈ F, and cls(F)(C) := 0 otherwise). For multi-clause-sets F we extend
the basic operations in the obvious way:


• var(F) := var(cls(F)), lit(F) := lit(cls(F)), ⋃F := ⋃cls(F).


• n(F) := n(cls(F)) ∈ ℕ₀, c(F) := Σ_{C ∈ F} F(C) ∈ ℕ₀, δ(F) := c(F) − n(F) ∈
  ℤ, ℓ(F) := Σ_{C ∈ F} F(C) · |C| ∈ ℕ₀.


The application of partial assignments φ ∈ PASS to a multi-clause-set F
yields a multi-clause-set φ * F, where the multiplicity of a clause C ∈ CL in
φ * F is the sum of all multiplicities of clauses D ∈ F (i.e., D ∈ cls(F)) which are
shortened to C by φ:

(φ * F)(C) := Σ_{D ∈ F, D ∩ φ⁻¹(1) = ∅, D \ φ⁻¹(0) = C} F(D).


Example 2.8 If φ is a total assignment for F (assigns all variables of F, that is,
var(φ) = var(F)), then φ * F is {m * ⊥}, denoting the multiplicity of a clause by a
(formal) factor, with m = Σ_{C ∈ F, C ∩ φ⁻¹(1) = ∅} F(C) ∈ ℕ₀ (so m = 0 ⇔ φ * F = ⊤).


For us the clause-sets are the objects of interest, while multi-clause-sets are only
auxiliary devices, created by the operation of “restriction” defined next (Definition
2.9). However we have to take care of the details, and thus together with introducing
a class C ⊆ CLS of clause-sets we also introduce the corresponding class C of multi-clause-
sets (using the generalised definition of C), where we must discuss the relation. To
start with, the classes SAT and USAT are invariant under multiplicities, that
is, a multi-clause-set is in it iff the underlying clause-set is in the underlying class of
clause-sets (SAT resp. USAT). The other extreme we have with the class HIT of
multi-hitting-clause-sets, which disallows multiplicities, that is, all multiplicities
must be 1 (since clauses cannot clash with themselves, by definition of clauses),
and thus up to the canonical identification the class HIT of hitting clause-sets and
the class HIT of multi-hitting-clause-sets are identical.

An important operation on multi-clause-sets is the “restriction” to a set of
variables (see Subsection 3.5 in for more information):


Definition 2.9 For a set V ⊆ VA of variables and a multi-clause-set F, by
F[V] the restriction of F to V is denoted, which is the multi-clause-set
obtained by removing clauses from F which have no variables in common with V,
and removing from the remaining clauses all literals where the underlying variable
is not in V:

F[V](C) := Σ_{D ∈ F, var(D) ∩ V ≠ ∅, D ∩ (V ∪ V̄) = C} F(D).

Here it is essential that F[V] is a multi-clause-set, and if previously unequal clauses
become equal, then accordingly the multiplicity is increased.


11) In earlier papers we used “MCLS” instead of “CLS”.

Example 2.10 {{a}, {a, b}, {b}, {ā}}[{a}] = {2 * {a}, {ā}}.
Simple properties of this operation are (for multi-clause-sets F and V, V′ ⊆ VA):


1. F[∅] = ⊤, F[V] = F \ {⊥} for var(F) ⊆ V (where F \ F′ for a clause-set F′
   means that all occurrences of clauses from F′ are removed from F).


2. (F[V])[V′] = F[V ∩ V′].
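The restriction operation and the role of multiplicities can be illustrated as follows. As an assumption of this sketch, a multi-clause-set is modelled by a `collections.Counter` mapping clauses (frozensets of non-zero integers) to their multiplicities; the name `restrict` is our own.

```python
from collections import Counter


def restrict(F, V):
    """Restriction F[V] of a multi-clause-set F to the variable set V.

    Clauses sharing no variable with V are dropped; the others are cut down
    to the literals over V, and multiplicities of coinciding results add up.
    """
    V = set(V)
    result = Counter()
    for D, mult in F.items():
        if mult <= 0:
            continue
        C = frozenset(x for x in D if abs(x) in V)
        if C:                       # non-empty iff var(D) meets V
            result[C] += mult
    return result
```

For instance, with a = 1 and b = 2, restricting the four clauses {a}, {a, b}, {b}, {ā} to {a} merges the first two clauses into a single clause {a} of multiplicity 2. The two simple properties listed above can be checked on such small instances.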


Clause-sets F, G are called isomorphic, if the variables of F can be renamed
and potentially flipped so that F is turned into G. More precisely, an isomorphism
α from F to G is a bijection α : lit(F) → lit(G) which preserves complementation
(α(x̄) is the complement of α(x)), and which maps the clauses of F precisely to the clauses of G; when
considering multi-clause-sets, then the isomorphism must preserve the multiplicity
of clauses (that is, G(α(C)) = F(C) for all C ∈ CL).


2.5 Degrees 


For the number of occurrences of a literal x ∈ LIT in a (multi-)clause-set F
we write

ld_F(x) := Σ_{C ∈ F, x ∈ C} F(C) ∈ ℕ₀,

called the literal-degree, while the variable-degree of a variable v is defined as
vd_F(v) := ld_F(v) + ld_F(v̄) ∈ ℕ₀. A (multi-)clause-set F is called variable-regular
if all variables v ∈ var(F) have the same degree, or, stronger, literal-regular, if all
literals x ∈ lit(F) have the same degree. A singular variable in a (multi-)clause-
set F is a variable occurring in one sign only once (i.e., 1 ∈ {ld_F(v), ld_F(v̄)}). A
(multi-)clause-set is called non-singular if it does not have singular variables. The
central concept for this report is the degree of a variable with minimal occurrences:


Definition 2.11 We define the minimum variable-degree μvd : CLS → ℕ ∪
{+∞} (“min-var-degree” for short) as follows (which also works for multi-clause-
sets F):

• For F ∈ CLS with n(F) ≠ 0 we let μvd(F) := min_{v ∈ var(F)} vd_F(v) ∈ ℕ.
• While for n(F) = 0 we set μvd(F) := +∞.

For a class C ⊆ CLS of clause-sets let μvd(C) ∈ ℕ₀ ∪ {+∞} be the supremum of
μvd(F) for F ∈ C with n(F) > 0, where we set μvd(C) := 0 if there is no such F
(while otherwise we have μvd(C) ≥ 1).

By definition we have μvd(C) ≤ μvd(C′) for C ⊆ C′ ⊆ CLS.


Example 2.12 For F := {2 * {a, b}, {ā, b}, {b̄, c}} we have:
• ld_F(a) = 2, ld_F(ā) = 1, ld_F(b) = 3, ld_F(b̄) = 1, ld_F(c) = 1, ld_F(c̄) = 0.
• vd_F(a) = 3, vd_F(b) = 4, vd_F(c) = 1.
• μvd(F) = 1.
Examples for μvd(C) are:
• μvd(∅) = 0.
• μvd(CLS) = +∞.
• μvd({⊤, {⊥}, {{v}, {v̄}}}) = 2.
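Literal-degree, variable-degree and min-var-degree can be computed as in the following sketch. As before, literals are assumed to be non-zero integers, and a multi-clause-set is modelled (as an assumption of this illustration) by a `Counter` from clauses to multiplicities; the function names are hypothetical.

```python
import math
from collections import Counter


def literal_degree(F, x):
    """ld_F(x): number of occurrences of literal x, counted with multiplicity."""
    return sum(mult for C, mult in F.items() if x in C)


def variable_degree(F, v):
    """vd_F(v) = ld_F(v) + ld_F(-v)."""
    return literal_degree(F, v) + literal_degree(F, -v)


def min_var_degree(F):
    """Minimum variable-degree of F; +infinity if F has no variables."""
    V = {abs(x) for C, mult in F.items() if mult > 0 for x in C}
    if not V:
        return math.inf
    return min(variable_degree(F, v) for v in V)


if __name__ == "__main__":
    # Two copies of {1, 2}, one copy each of {-1, 2} and {-2, 3}.
    F = Counter({frozenset({1, 2}): 2,
                 frozenset({-1, 2}): 1,
                 frozenset({-2, 3}): 1})
    print([variable_degree(F, v) for v in (1, 2, 3)], min_var_degree(F))
```

The instance in the usage part has literal-degrees 2, 1, 3, 1, 1, 0 for the literals 1, −1, 2, −2, 3, −3, and hence variable-degrees 3, 4, 1.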


The simplest but relevant class of clause-sets for us is given by the A(V) (the 
unsatisfiable full clause-sets; these are the simplest unsatisfiable clause-sets): 


Lemma 2.13 For n ∈ ℕ₀ we have
1. n(A_n) = n, c(A_n) = 2^n, δ(A_n) = 2^n − n.
2. A_n is full and unsatisfiable, and thus A_n ∈ UHIT_{δ=2^n−n}.
3. A_n is literal-regular (thus variable-regular), with μvd(A_n) = 2^n.


Further properties of unsatisfiable full clause-sets one finds in Example 2.20, Lemma
B., Corollary Lemmas 6.9, 6.10, Corollaries 6.11, 6.1, and Examples 9.2, 9.10,

Properties of satisfiable full clause-sets are found in Example


2.6 Autarkies 


Besides algorithmic considerations, which were always present since the introduction
of the notion of an “autarky” in [79], also a kind of a “combinatorial SAT theory”
has been developed around this notion of generalised satisfying assignments. A
general overview is given in [43], with recent additions and generalisations in [67].
An autarky (see [43, Section 11.8]) for a clause-set F ∈ CLS is a partial assignment
φ ∈ PASS which satisfies every clause C ∈ F it touches, i.e., for all C ∈ F
with var(φ) ∩ var(C) ≠ ∅ holds φ * {C} = ⊤; equivalently, for all C ∈ F holds
C ∩ φ⁻¹(0) ≠ ∅ ⇒ C ∩ φ⁻¹(1) ≠ ∅. The simplest examples for autarkies are as
follows:


Example 2.14 The empty partial assignment ⟨⟩ is an autarky for every F ∈ CLS
(no clause is touched), and more generally all φ ∈ PASS with var(φ) ∩ var(F) = ∅
are autarkies for F, the trivial autarkies. On the other end of the spectrum every
satisfying assignment for F (i.e., φ * F = ⊤) is an autarky for F (every clause is
satisfied). A literal x ∈ LIT is a pure literal for F iff ⟨x → 1⟩ is an autarky for F.
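The autarky condition is easy to check for a given partial assignment. The sketch below assumes integer literals (complement = negation) and partial assignments as dictionaries from variables to 0/1; `is_autarky` is a name chosen for this illustration only.

```python
def is_autarky(phi, F):
    """True iff phi satisfies every clause of F that it touches.

    phi maps variables (positive integers) to 0 or 1;
    F is an iterable of frozensets of non-zero integers.
    """
    def value(x):
        v = phi.get(abs(x))
        if v is None:
            return None
        return v if x > 0 else 1 - v

    for C in F:
        touched = any(abs(x) in phi for x in C)
        satisfied = any(value(x) == 1 for x in C)
        if touched and not satisfied:
            return False
    return True
```

On F = {{1}, {−1, 2}} the assignment setting the pure literal 2 to true is an autarky, while setting variable 1 to true alone is not, since the touched clause {−1, 2} is then not satisfied.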


If φ is an autarky for F, then φ * F ⊆ F holds, and thus φ * F is satisfiability-
equivalent to F. Autarkies mark redundancies, and the corresponding notion of
clause-sets without such redundancies was introduced in [50], namely a clause-set
F is lean if there is no non-trivial autarky for F, and the set of all lean clause-sets
is denoted by LEAN ⊂ USAT ∪ {⊤}. The class LEAN of lean multi-clause-sets is
invariant under multiplicities.


Example 2.15 Some simple examples:

1. ⊤, {⊥}, {{v}, {v̄}}, {{v}, {v̄}, {w}, {w̄}} ∈ LEAN.
2. If F, F′ ∈ LEAN, then F ∪ F′ ∈ LEAN.
3. If F ∈ LEAN and F′ ∈ CLS with var(F′) ⊆ var(F), then F ∪ F′ ∈ LEAN.
4. {{v}, {v̄}, {w}} ∉ LEAN.


A weakening is the notion of a matching-lean clause-set F (introduced in
[51, Section 7]; see [43, Section 11.11] for an overview), which has no non-trivial
matching autarky, which are special autarkies given by a matching condition:
for every clause touched, a satisfied literal with unique underlying variable must
be selectable; the class of all matching-lean clause-sets is denoted by MLEAN ⊃
LEAN.
Example 2.16 A clause-set F ∈ CLS is matching-lean (F ∈ MLEAN) iff for all
F′ ⊂ F holds δ(F′) < δ(F) (4, Theorem 7.5]). Thus for every matching-lean
multi-clause-set F ≠ ⊤ we have δ(F) ≥ 1 ([54], generalising [9]).

F := {{1, 3}, {2, −3}, {3}, {−3}} has the matching autarky ⟨1 → 1, 2 → 1⟩, while
for F′ := F ∪ {{1, 2}} we have F′ ∈ MLEAN. Note δ(F) = 1 = δ({{3}, {−3}}),
while δ(F′) = 2.


It is decidable in polynomial time whether F ∈ MLEAN holds (which follows
for example by the characterisation of MLEAN via the surplus below). The class
MLEAN ⊃ LEAN of matching-lean multi-clause-sets is not invariant under mul-
tiplicities: For a multi-clause-set F holds cls(F) ∈ MLEAN ⇒ F ∈ MLEAN, but
not the other way around:


Example 2.17 {{v}} ∉ MLEAN, but {2 * {v}} ∈ MLEAN, and more generally
{{x_1, ..., x_n}} ∉ MLEAN for n ≥ 1, while {(n + 1) * {x_1, ..., x_n}} ∈ MLEAN.
Indeed it is easy to see that for every clause-set F there is a multi-clause-set F′ with cls(F′) = F
and F′ ∈ MLEAN.


The process of applying autarkies as long as possible to a clause-set F ∈ CLS is
confluent, yielding the lean kernel of F (the largest lean sub-clause-set of F, that
is, ⋃{F′ ⊆ F : F′ ∈ LEAN}; see [50, Section 3]). Computation of the lean kernel
is NP-hard, since the lean kernel of satisfiable clause-sets is ⊤. But the matching-
lean kernel of F (the largest matching-lean sub-clause-set of F, that is, ⋃{F′ ⊆
F : F′ ∈ MLEAN}; see 1, Section 3]), now obtained by applying matching
autarkies as long as possible, which again is a confluent process, is computable
in polynomial time. Note that a clause-set F is lean resp. matching-lean iff the
lean resp. matching-lean kernel is F itself. Due to the polytime computability of
the matching-lean kernel, which is a sub-clause-set obtained by removing clauses
redundant in a strong sense, “w.l.o.g.” for the purpose of SAT-decision one might
consider the inputs as matching-lean.


Example 2.18 For inputs F ∈ MLEAN by fod, Theorem 4] we have SAT-decision
in time O(2^{δ(F)} · n(F)³) (see for generalisations), and thus SAT-decision for
inputs F ∈ MLEAN is fixed-parameter tractable (fpt) in the parameter δ(F).
We note here (though we won’t use it in this report), that for inputs F ∈
MLEAN_{δ=k} the computation of the lean kernel can be done in polynomial time
for fixed k ([54, Theorem 10.3]; this computational problem appears not to be fpt).


A multi-clause-set F ≠ ⊤ is matching-lean iff for the surplus we have σ(F) ≥ 1
([51, Lemma 7.7]), which is defined as follows (see Subsection 11.1 in bz for more
information; in a clause-set has “q-expansion” iff σ(F) ≥ q):


Definition 2.19 For a multi-clause-set F let σ(F) ∈ ℤ be defined as the minimum
of δ(F[V]) (recall Definition 2.9) over all ∅ ≠ V ⊆ var(F) if n(F) > 0, while
σ(F) := 0 in case of n(F) = 0.


Note that for ∅ ≠ V ⊆ var(F) we have

δ(F[V]) = c(F[V]) − |V| = Σ_{C ∈ F, var(C) ∩ V ≠ ∅} F(C) − |V|.
CEF, var(C)NV AO 


The surplus is computable in polynomial time. Some basic properties of the surplus
are (for multi-clause-sets F):

1. σ(F) ≤ δ(F[var(F)]) = δ(F \ {⊥}) ≤ c(F).
2. For every ∅ ⊂ V ⊆ var(F) holds σ(F[V]) ≥ σ(F).
3. For F′ ≤ F with var(F′) = var(F) we have σ(F′) ≤ σ(F).
4. −n(F) ≤ σ(F), and for n(F) > 0 holds 1 − n(F) ≤ σ(F) ≤ c(F) − 1.


Example 2.20 σ(A_0) = 0, while σ(A_n) = 2^n − n = δ(A_n) for n ∈ ℕ (since every
variable occurs in every clause). If we take F ∈ CLS and some v ∈ VA \ var(F),
then σ(F ∪ {{v}}) ≤ 0.
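For small instances the surplus can be evaluated directly from Definition 2.19 by enumerating all non-empty variable sets. The following sketch does exactly this and is therefore exponential in n(F); it does not implement the polynomial-time computation mentioned above. Multi-clause-sets are modelled as `Counter` objects (an assumption of this illustration), and `surplus` is a hypothetical name.

```python
from collections import Counter
from itertools import combinations


def surplus(F):
    """Brute-force surplus: minimum over non-empty V of c(F[V]) - |V|.

    F is a Counter mapping clauses (frozensets of non-zero integers)
    to multiplicities. Returns 0 if F has no variables.
    """
    variables = sorted({abs(x) for C, m in F.items() if m > 0 for x in C})
    if not variables:
        return 0
    best = None
    for size in range(1, len(variables) + 1):
        for V in combinations(variables, size):
            Vset = set(V)
            # c(F[V]): clauses (with multiplicity) sharing a variable with V
            touched = sum(m for C, m in F.items()
                          if any(abs(x) in Vset for x in C))
            d = touched - size
            best = d if best is None else min(best, d)
    return best
```

On the clause-set {{1, 3}, {2, −3}, {3}, {−3}} the minimum is attained for V = {1} and equals 0, while adding the clause {1, 2} raises the surplus to 1, matching the characterisation of matching-lean clause-sets via σ(F) ≥ 1.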


3 Minimally unsatisfiable clause-sets 


In this section we review minimally unsatisfiable clause-sets; see [43] for an overview,
while 63, contain recent developments. First the basic definitions and examples
are given in Subsection 3.1. In Subsection 3.2 we consider in some detail the funda-
mental process of “saturation”, which is about adding “missing literal occurrences”
to minimally unsatisfiable clause-sets. Saturation repairs the problem that splitting
of F ∈ MU into ⟨v → 0⟩ * F and ⟨v → 1⟩ * F may destroy minimal unsatisfiability,
i.e., ⟨v → 0⟩ * F ∉ MU or ⟨v → 1⟩ * F ∉ MU might hold, due to some clauses
missed to be deleted, and this process is considered in Subsection 3.3.


3.1 MU and subclasses 


An unsatisfiable clause-set F is called minimally unsatisfiable, if for every clause
C ∈ F the clause-set F \ {C} is satisfiable, and the set of minimally unsatisfiable
clause-sets is denoted by MU ⊂ USAT. A clause-set F ∈ MU is called sat-
urated, if replacing any C ∈ F by any super-clause C′ ⊃ C yields a satisfiable
clause-set, and the set of saturated minimally unsatisfiable clause-sets is denoted
by SMU ⊂ MU.


Example 3.1 The simplest element of USAT \ MU is {⊥, {1}}, while the simplest
element of MU \ SMU is {{1, 2}, {−1}, {−2}} (see Example 3.5 for a “saturation”).
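For very small clause-sets, membership in MU and SMU can be tested by brute force. The sketch below assumes integer literals (complement = negation) and uses an exhaustive satisfiability test; the function names are our own. For saturation it only tries adding a single literal over var(F) to a clause, which suffices for F ∈ MU: enlarging a clause further only weakens the clause-set, and a literal over a new variable can always be set to true.

```python
from itertools import product


def is_satisfiable(F):
    """Exhaustive search over all total assignments (tiny inputs only)."""
    V = sorted({abs(x) for C in F for x in C})
    for bits in product((False, True), repeat=len(V)):
        val = dict(zip(V, bits))
        if all(any(val[abs(x)] == (x > 0) for x in C) for C in F):
            return True
    return False


def is_minimally_unsatisfiable(F):
    """F is unsatisfiable, and removing any one clause makes it satisfiable."""
    F = frozenset(F)
    return not is_satisfiable(F) and all(is_satisfiable(F - {C}) for C in F)


def is_saturated(F):
    """For F in MU: no literal occurrence can be added keeping unsatisfiability."""
    F = frozenset(F)
    V = {abs(x) for C in F for x in C}
    for C in F:
        for v in V - {abs(x) for x in C}:
            for x in (v, -v):
                G = (F - {C}) | {C | {x}}
                if not is_satisfiable(G):
                    return False
    return True
```

On the two clause-sets of Example 3.1 these tests report that {⊥, {1}} is unsatisfiable but not minimally so, and that {{1, 2}, {−1}, {−2}} is minimally unsatisfiable but not saturated.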


Unsatisfiable hitting clause-sets fulfil UHIT ⊂ SMU (see [64, Lemma 2] for the
proof). The subsets of non-singular elements (i.e., there is no literal occurring only
once) are denoted by MU′ ⊂ MU, SMU′ ⊂ SMU, and UHIT′ ⊂ UHIT.


Example 3.2 By holds MU′_{δ=1} = SMU′_{δ=1} = UHIT′_{δ=1} = {{⊥}}, while for
the characterisation of MU_{δ=1} ⊃ SMU_{δ=1} = UHIT_{δ=1} see also Ee 44). As shown
in fy, for F ∈ MU_{δ=1} with n(F) > 0 holds μvd(F) = 2.


We consider the “reasons” for unsatisfiability as given by the elements of MU_{δ=1}
as “noise”, only “masking” the pure contradiction of the only element of MU′_{δ=1} =
{{⊥}} (in Section 5 the elimination of singular variables will be discussed). “Real
reasoning” starts with deficiency 2:


Example 3.3 By i, the elements of MU′_{δ=2} are up to isomorphism precisely the
F_n for n ≥ 2, where

F_n := {{1, ..., n}, {−1, ..., −n}, {−1, 2}, ..., {−(n − 1), n}, {−n, 1}}.

All F_n are literal-regular, with μvd(F_n) = 4. It is easy to see that all F_n are
saturated, and thus MU′_{δ=2} = SMU′_{δ=2}. The only hitting clause-sets amongst the
F_n are for n = 2, 3, and thus up to isomorphism the elements of UHIT′_{δ=2} are
F_2, F_3, with F_2 = A_2 and

F_3 = {{1, 2, 3}, {−1, −2, −3}, {−1, 2}, {−2, 3}, {−3, 1}}.

We have σ(F_n) = 2 = δ(F_n), since any m < n variables occur at least in m different
binary clauses and in the two “long clauses”. Further properties of the F_n we have
in Examples 3.15, 6.4, 8.4, 9.4. See Section 7 in [65] for more information.
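The clause-sets F_n are easy to generate, and several of the stated properties can be confirmed experimentally for small n. The sketch below assumes integer literals (complement = negation); the names `f_n`, `is_hitting` and the exhaustive satisfiability test are our own illustrative choices and only suitable for small n.

```python
from itertools import combinations, product


def f_n(n):
    """The clause-set F_n (n >= 2): two long clauses plus the cycle of binary clauses."""
    if n < 2:
        raise ValueError("n must be >= 2")
    long_pos = frozenset(range(1, n + 1))
    long_neg = frozenset(-i for i in range(1, n + 1))
    cycle = {frozenset({-i, i % n + 1}) for i in range(1, n + 1)}
    return frozenset({long_pos, long_neg} | cycle)


def is_hitting(F):
    """Every two different clauses clash in at least one literal."""
    return all(any(-x in D for x in C) for C, D in combinations(F, 2))


def is_satisfiable(F):
    V = sorted({abs(x) for C in F for x in C})
    for bits in product((False, True), repeat=len(V)):
        val = dict(zip(V, bits))
        if all(any(val[abs(x)] == (x > 0) for x in C) for C in F):
            return True
    return False


def literal_degree(F, x):
    return sum(1 for C in F if x in C)


if __name__ == "__main__":
    for n in range(2, 6):
        F = f_n(n)
        print(n, len(F) - n, is_hitting(F), is_satisfiable(F))
```

For n = 2 the construction yields the four full clauses over {1, 2}, and for n ≥ 4 two binary clauses on disjoint variables witness that F_n is not hitting.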


As shown in [65, Theorem 74], for every F ∈ MU_{δ=2} there is a unique n ≥ 2 such
that F_n “embeds” into F, and this n is called the “non-singularity type” of F.
So for MU_{δ=2} we have identified the (in a sense) unique reason of unsatisfiability,
the (possibly hidden) presence of a cycle v_1 → ... → v_n → v_1 together with the
assertions that one v_i must be true and one must be false (only the n is unique in
general, not the v_i). We will come back to the theme of classifying MU_{δ=k} in the
Conclusion, Subsection 15.5.

By definition, MU disallows multiplicities (since a duplicated clause is the trivial
logical redundancy), and this also holds for the subclasses SMU and UHIT (as well
as for all other subclasses of MU considered here). A fundamental fact is δ(F) ≥ 1
for all F ∈ MU (note that every minimally unsatisfiable clause-set is lean), which
motivates the investigation of the layers MU_{δ=1}, MU_{δ=2}, .... Special elements of
UHIT are the A(V) for finite sets V of variables (recall Lemma 2.13), which are
the minimally unsatisfiable clause-sets with maximal deficiency for a given number
of variables, as we will see in Corollary 6.11.


3.2 Saturation 


We recall the fact ([54, 62]) that every minimally unsatisfiable clause-set F ∈ MU
can be saturated, i.e., by adding literal occurrences to F we obtain F′ ∈ SMU
with var(F′) = var(F) such that there is a bijection α : F → F′ with C ⊆ α(C) for
all C ∈ F. Since we need to consider saturation in many situations, we introduce
some special notations for it from (Subsection 2.2). First we introduce the
notation S(F, C, x) for adding a literal x to a clause C in a clause-set F:


Definition 3.4 ([65]) The operation (adding literal x to clause C in F)

S(F, C, x) := (F \ {C}) ∪ {C ∪ {x}} ∈ CLS

is defined if F ∈ CLS, C ∈ F, and x is a literal with var(x) ∈ var(F) \ var(C).
Some technical remarks:
1. var(S(F, C, x)) = var(F).
2. If C ∪ {x} ∉ F, then c(S(F, C, x)) = c(F), and thus also δ(S(F, C, x)) = δ(F).
3. For F ∈ MU we have:

(a) S(F, C, x) ∈ MU iff S(F, C, x) is unsatisfiable (since all that happened
    is that a clause has been weakened, i.e., extended).

(b) If S(F, C, x) ∈ MU, then c(S(F, C, x)) = c(F) (no subsumption here).

(c) F is saturated iff there are no C, x such that S(F, C, x) ∈ USAT.


Example 3.5 For F := {{a, b}, {ā}, {b̄}} ∈ MU \ SMU we have S(F, {ā}, b) =
{{a, b}, {ā, b}, {b̄}} ∈ SMU.
A “saturation” of a minimally unsatisfiable clause-set is obtained by adding
literals to clauses as long as possible while maintaining unsatisfiability (which is the
same as maintaining minimal unsatisfiability):


Definition 3.6 ([65]) A saturation F′ ∈ SMU of F ∈ MU is obtained by a
saturation sequence F = F_0, ..., F_m = F′, m ∈ ℕ₀, such that

(i) for 0 ≤ i < m there are C_i, x_i with F_{i+1} = S(F_i, C_i, x_i),
(ii) for all 1 ≤ i ≤ m we have F_i ∈ USAT,
(iii) the sequence cannot be extended (without violating conditions (i) or (ii)).

Note that n(F′) = n(F) and c(F′) = c(F) holds (and thus δ(F′) = δ(F)). If we
drop requirement (iii), then we speak of a partial saturation sequence, while
F′ ∈ MU is a partial saturation of F ∈ MU.


Some technical remarks:
1. F′ ∈ MU is a partial saturation of F ∈ MU iff
   • var(F′) = var(F)
   • there is a bijection α : F → F′ such that for all C ∈ F we have C ⊆ α(C).

2. For a partial saturation sequence F_0, ..., F_m we have ℓ(F_m) = ℓ(F_0) + m.

3. F′ is a saturation of F ∈ MU iff F′ is a partial saturation of F with F′ ∈
   SMU.


Example 3.7 A saturation sequence for F := {{a, b, c}, {ā}, {b̄}, {c̄}} with m = 3
is obtained by adding literals b, c to clause {ā}, and adding literal c to clause {b̄}.


We can perform a partial saturation F ↝ S(F, C, x) iff F without C implies
(logically) C ∪ {x̄} (note that C ∪ {x}, C ∪ {x̄} implies C):


Lemma 3.8 Consider F ∈ MU, C ∈ F, and a literal x with var(x) ∈ var(F) \
var(C). Then S(F, C, x) is a partial saturation of F if and only if F \ {C} ⊨ C ∪ {x̄}.


Proof: First assume that S(F, C, x) is a partial saturation of F, but F \ {C} ⊭
C ∪ {x̄}. So there is a partial assignment φ with φ * (F \ {C}) = ⊤ but φ * {C ∪ {x̄}} =
{⊥} (whence φ(x) = 1). But then we have φ * S(F, C, x) = ⊤. Reversely assume
F \ {C} ⊨ C ∪ {x̄}, but that S(F, C, x) is not a partial saturation of F. So S(F, C, x)
has a satisfying assignment φ; due to F ∈ USAT we have φ(x) = 1 and φ * {C} =
{⊥}. But this yields φ * (F \ {C}) = ⊤ and φ * {C ∪ {x̄}} = {⊥}.


See Lemma 6.5, Part 6, for another characterisation of partial saturations. The 
dual notion of “saturated” is “marginal”: F ∈ MU is marginal iff replacing any
clause by a strict subclause yields a clause-set not in MU. The decision “F marginal
minimally unsatisfiable?” for inputs F ∈ CLS is D^P-complete (5) Theorem 2]).
By [44, Theorem 8] however this decision is easy for inputs F ∈ SMU, namely
there is the following characterisation of minimally unsatisfiable clause-sets which 
are marginal and saturated at the same time: 


Lemma 3.9 ([44]) F ∈ MU is marginal and saturated iff F = A(var(F)).


Thus F ∈ SMU is marginal iff F = A(var(F)); so if F ∈ SMU is not full,
then there is a literal occurrence which can be removed without destroying minimal
unsatisfiability, that is, there exists a clause C ∈ F and x ∈ C such that F′ :=
(F \ {C}) ∪ {C \ {x}} ∈ MU (note that F′ ∈ USAT in any case, but minimality in
general is not maintained). And for inputs F ∈ SMU the existence of such C, x is
decidable in linear time (namely they exist iff F is not full). But finding such C, x
should be hard in general, that is, the decision problem, whether a concrete literal
can be removed, even for inputs F ∈ SMU should be NP-complete:


Question 3.10 Is the promise problem for input F ∈ SMU, C ∈ F, x ∈ C,
whether “(F \ {C}) ∪ {C \ {x}} ∈ MU?”, NP-complete? (That is, is there a
polytime computation G ∈ CLS ↝ (F, C, x) ∈ SMU × CL × LIT, with x ∈ C ∈ F,
such that G ∈ SAT ⇔ (F \ {C}) ∪ {C \ {x}} ∈ MU?)

And is the promise problem for input F ∈ MU, whether F is marginal, NP-complete?
(That is, is there a polytime computation G ∈ CLS ↝ F ∈ MU, such
that G ∈ SAT iff F is marginal?)


Back to saturation: precisely all saturated clause-sets except the A(V) are obtained
as non-trivial saturations of some minimally unsatisfiable clause-set:


Corollary 3.11 Consider F ∈ SMU.
1. F is trivially the saturation of itself.


2. If F = A(var(F)), then this is also the only possibility for F being a saturation,
that is, if F is the saturation of some F′ ∈ MU, then we have F′ = F.


3. Otherwise F is a saturation of some clause-set other than itself, that is, if
F ≠ A(var(F)), then there is some F′ ∈ MU with F′ ≠ F such that F is a
saturation of F′.


Proof: Part 1 is trivial. For Part 2 assume that F = A(var(F)), and we have
F = S(F′, C, x) for some F′ ∈ MU: But since F is marginal, F′ is not minimally
unsatisfiable. Finally for Part 3 note that if F ≠ A(var(F)), then by Lemma 3.9
F is not marginal, and thus there is C ∈ F and x ∈ C such that for C′ := C \ {x}
and F′ := (F \ {C}) ∪ {C′} we have F′ ∈ MU. Now F = S(F′, C′, x).


As discussed above, we expect the decision whether for inputs F ∈ SMU,
C ∈ F and x ∈ C we have (F \ {C}) ∪ {C \ {x}} ∈ MU to be NP-complete. But
this decision is easy for F ∈ UHIT, namely this holds iff no “subsumption resolution” with
another clause containing x̄ can be performed, that is, there is no D ∈ F with x̄ ∈ D
and C \ {x} ⊆ D:


Lemma 3.12 Consider F ∈ UHIT, C ∈ F and x ∈ C. Let C′ := C \ {x}, and let
F′ := (F \ {C}) ∪ {C′}. Then we have F′ ∈ MU iff there is no D ∈ F \ {C} with
C′ ⊂ D.


Proof: If there is D ∈ F \ {C} with C′ ⊂ D, then F′ ∉ MU. So assume there is
no such D. Assume F′ ∉ MU. Thus there is E ∈ F′ with F′ \ {E} unsatisfiable.
We must have E ≠ C′, since otherwise F′ \ {C′} = F \ {C} would be unsatisfiable.
Since F is hitting, E clashes with every other clause of F′ \ {C′}. It follows that
C′ ⊂ E must hold (since the falsifying assignments for E are disjoint from those for
any other clause in F′ \ {C′}, and thus all of them must falsify C′), contradicting
the assumption that there is no such D (take D := E).


Some examples on removable literal occurrences illustrate Lemma 3.12:


Example 3.13 For F := {{a, b}, {ā, b}, {b̄}} ∈ UHIT_{δ=1} we can exactly remove
one of the two literal-occurrences of b and still obtain a clause-set in MU (of course
not in UHIT anymore; the resulting clause-sets are in fact marginally minimally
unsatisfiable). For F_2 := A_2 = {{1, 2}, {-1, -2}, {-1, 2}, {-2, 1}} ∈ UHIT′_{δ=2} we
can not remove any literal occurrence without leaving MU (i.e., F_2 is marginal).
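
The criterion of Lemma 3.12 can be compared with a direct minimal-unsatisfiability check on the two clause-sets of Example 3.13. This is a brute-force sketch under the assumption that literals are non-zero integers and clauses are frozensets; the helper names are hypothetical.

```python
from itertools import product

def is_sat(F):
    """Brute-force satisfiability test (exponential; tiny examples only)."""
    vs = sorted({abs(l) for C in F for l in C})
    for bits in product([False, True], repeat=len(vs)):
        val = dict(zip(vs, bits))
        if all(any(val[abs(l)] == (l > 0) for l in C) for C in F):
            return True
    return False

def is_mu(F):
    return not is_sat(F) and all(is_sat(F - {C}) for C in F)

def removable_by_lemma(F, C, x):
    """Lemma 3.12 (F hitting): x is removable from C iff no other clause
    is a strict superset of C without x."""
    reduced = C - {x}
    return not any(reduced < D for D in F if D != C)

def removable_brute_force(F, C, x):
    """Direct check: is (F without C) plus (C without x) in MU?"""
    return is_mu((F - {C}) | {C - {x}})

def removable_occurrences(F, test):
    return {(C, x) for C in F for x in C if test(F, C, x)}

def cls(*clauses):
    return frozenset(frozenset(c) for c in clauses)

F1 = cls({1, 2}, {-1, 2}, {-2})               # a = 1, b = 2
A2 = cls({1, 2}, {-1, -2}, {-1, 2}, {-2, 1})

for F in (F1, A2):
    print(len(removable_occurrences(F, removable_by_lemma)),
          len(removable_occurrences(F, removable_brute_force)))
```

For F1 both tests single out the two occurrences of literal 2 (the literal b), while for A2 no occurrence is removable.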


3.3 Splitting 


An important (equivalent) characterisation of saturation for F ∈ CLS, as shown in
[58], is that splitting on any variable v will yield minimally unsatisfiable clause-sets
⟨v → 0⟩ * F, ⟨v → 1⟩ * F. This enables induction on the number of variables,
which is a central method for this report; see Lemma for the basic example. We
also have that if for some variable both splitting results are minimally unsatisfiable
resp. saturated, then this can be lifted to the original clause-set, provided that no
contraction takes place. These basic facts are collected in the following lemma.


Lemma 3.14 For all clause-sets F ∈ CLS we have:
1. F ∈ SMU iff for all v ∈ var(F) and all ε ∈ {0, 1} we have ⟨v → ε⟩ * F ∈ MU.


2. If for some variable v holds ⟨v → 0⟩ * F ∈ MU and ⟨v → 1⟩ * F ∈ MU, and
if for all C ∈ F with v ∈ var(C) we have C \ {v, v̄} ∉ F, then F ∈ MU.


3. If for some variable v holds ⟨v → 0⟩ * F ∈ SMU and ⟨v → 1⟩ * F ∈ SMU,
and if for all C ∈ F with v ∈ var(C) we have C \ {v, v̄} ∉ F, then F ∈ SMU.


Proof: Part 1 is Corollary 5.3 in [58]. For Part 2 assume F ∉ MU; thus there is
C ∈ F with F′ := F \ {C} ∈ USAT. We consider three cases:


1. v ∉ var(C): Due to the assumption on subsumption-freeness we have C ∪ {v} ∉
F′. Now C ∈ ⟨v → 0⟩ * F, while (⟨v → 0⟩ * F) \ {C} = ⟨v → 0⟩ * F′ ∈ USAT,
contradicting ⟨v → 0⟩ * F ∈ MU.


2. v ∈ C: By assumption holds C′ := C \ {v} ∉ F′. Now C′ ∈ ⟨v → 0⟩ * F, while
(⟨v → 0⟩ * F) \ {C′} = ⟨v → 0⟩ * F′ ∈ USAT, contradicting ⟨v → 0⟩ * F ∈ MU.


3. v̄ ∈ C: By assumption holds C′ := C \ {v̄} ∉ F′. Now C′ ∈ ⟨v → 1⟩ * F, while
(⟨v → 1⟩ * F) \ {C′} = ⟨v → 1⟩ * F′ ∈ USAT, contradicting ⟨v → 1⟩ * F ∈ MU.


Finally consider Part 3. By Part 2 we already know that F ∈ MU holds. Assume
that F ∉ SMU; thus there is C ∈ F and a literal x with F′ := S(F, C, x) ∈ USAT.
So by Lemma 3.8 we have F \ {C} ⊨ C′ := C ∪ {x̄}. There exists at least one
ε ∈ {0, 1} with ⟨v → ε⟩ * {C′} ≠ ⊤, and then ⟨v → ε⟩ * (F \ {C}) ⊨ ⟨v → ε⟩ * C′.
If var(x) = v, then this contradicts minimal unsatisfiability of ⟨v → ε⟩ * F. And if
var(x) ≠ v, then ⟨v → ε⟩ * F \ ⟨v → ε⟩ * {C} ⊨ (⟨v → ε⟩ * C) ∪ {x̄}, contradicting
saturatedness of ⟨v → ε⟩ * F by Lemma 3.8.


The additional assumption C \ {v, v̄} ∉ F for Parts 2, 3 is equivalent to saying
that when applying ⟨v → 0⟩, ⟨v → 1⟩, then no contraction takes place. An alternative
way of stating this would be to use multi-clause-sets, since then no contractions
would be performed, and the doubled clauses would destroy minimal unsatisfiability.
In [62] (Lemma 1) (and in the underlying report (3). Lemma 2.1) that additional
assumption for Parts 2, 3 is missing by mistake:


Example 3.15 An example for ⟨v → 0⟩ * F ∈ UHIT and ⟨v → 1⟩ * F ∈ UHIT,
but F ∉ MU is trivially given by {⊥, {v}, {v̄}} (note that F is a set; for a
multi-clause-set F the contraction would not occur).
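
Splitting and the role of contraction can be checked mechanically on small inputs. The sketch below assumes literals are non-zero integers and uses a hypothetical helper `split(F, v, eps)` for ⟨v → ε⟩ * F; it replays Example 3.15 and Part 1 of Lemma 3.14 on the clause-sets of Example 3.5.

```python
from itertools import product

def is_sat(F):
    """Brute-force satisfiability test (exponential; tiny examples only)."""
    vs = sorted({abs(l) for C in F for l in C})
    for bits in product([False, True], repeat=len(vs)):
        val = dict(zip(vs, bits))
        if all(any(val[abs(l)] == (l > 0) for l in C) for C in F):
            return True
    return False

def is_mu(F):
    return not is_sat(F) and all(is_sat(F - {C}) for C in F)

def split(F, v, eps):
    """<v -> eps> * F: drop satisfied clauses, remove the falsified literal.
    The result is a set, so equal clauses are contracted into one."""
    true_lit = v if eps else -v
    return frozenset(C - {-true_lit} for C in F if true_lit not in C)

def splits_all_mu(F):
    vs = {abs(l) for C in F for l in C}
    return all(is_mu(split(F, v, eps)) for v in vs for eps in (0, 1))

def cls(*clauses):
    return frozenset(frozenset(c) for c in clauses)

# Example 3.15: both splits are {bottom} (in MU), but F itself is not in MU.
F = cls((), {1}, {-1})
print(is_mu(split(F, 1, 0)), is_mu(split(F, 1, 1)), is_mu(F))

# Lemma 3.14, Part 1, on Example 3.5: saturated vs. non-saturated.
saturated = cls({1, 2}, {-1, 2}, {-2})
not_saturated = cls({1, 2}, {-1}, {-2})
print(splits_all_mu(saturated), splits_all_mu(not_saturated))
```

Because `split` returns a set, the two unit-clauses of Example 3.15 collapse onto the already present empty clause, which is exactly the contraction excluded in Parts 2 and 3.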
4 Variable-minimal unsatisfiability


In [11] the generalisation of minimal unsatisfiability to “variable-minimal
unsatisfiability” has been introduced, and the class of all such clause-sets is denoted by VMU,
the set of clause-sets F ∈ USAT such that for every F′ ⊆ F with F′ ∈ USAT
holds var(F′) = var(F). The corresponding class VMU of multi-clause-sets is
invariant under multiplicities. Thus, as with LEAN (and different from MU),
regarding variable-minimal unsatisfiability w.l.o.g. multi-clause-sets can be cast down
to clause-sets. The basic (trivial) characterisation of VMU is:


Lemma 4.1 For F ∈ CLS holds F ∈ VMU if and only if F ∈ USAT and for all
v ∈ var(F) holds {C ∈ F : v ∉ var(C)} ∈ SAT.


By definition we have MU ⊂ VMU, moreover, as shown in Lemma 6 of [11],
for every deficiency k ≥ 2 we have MU_{δ=k} ⊂ VMU_{δ=k} (for example, for every
F ∈ MU_{δ=k}, k ∈ N, and every non-full clause C ∈ F, i.e., var(C) ⊂ var(F), we can
add to F a full clause subsumed by C, obtaining F′ ∈ VMU_{δ=k+1} \ MU_{δ=k+1}).

In [11] there is the false statement “VMU ⊄ LEAN”, based on the following
erroneous example:


Example 4.2 [11, Page 266] gives the example F_4 := {{a}, {b}, {ā, b̄}, {a, b}} with
the assertion “F_4 ∈ VMU \ LEAN”. Obviously we have F_4 ∈ VMU, but we
also have F_4 ∈ LEAN. Using the characterisation from [54] (which is the only
characterisation used in [11]), that F ∈ LEAN holds iff every clause of F can be
used in a tree-resolution refutation of F, we see this as follows: the sole subset of
F_4 in MU is {{a}, {b}, {ā, b̄}}, while the clause {a, b} can(!) also be used in a
tree-resolution refutation; it is obviously superfluous, but nevertheless there is a
tree-resolution refutation using it, namely via ({a} ∘ {ā, b̄}) ∘ {a, b} = {a}.


Based on the characterisation of lean clause-sets via autarkies, it is easy to show
that VMU consists of special lean clause-sets (thus Figure 1 in [11] needs to be
corrected, showing instead that LEAN is indeed a superclass of VMU):


Lemma 4.3 VMU ⊂ LEAN \ {⊤}.


Proof: While in [11] the characterisation of LEAN via variables usable in resolution
refutation was (only) used, here we need to use the equivalent characterisation
via autarkies, shown in Theorem 3.16 in bo), and used as our definition in
Subsection 2.4, namely that for F ∈ CLS holds F ∈ LEAN iff there is no autarky φ
for F with var(φ) ∩ var(F) ≠ ∅: if we had such an autarky for F ∈ VMU, then
φ * F ∈ USAT with φ * F ⊆ F and var(φ * F) ⊆ var(F) \ var(φ), contradicting
F ∈ VMU. That we indeed have a strict subset can for example be seen by Lemma
3.2 in [0], which shows that if we extend a minimally unsatisfiable clause-set via
Extended Resolution, then we always stay in LEAN; another example is clause-set
F_5 on Page 266 in [11].


Thus it follows VMU_{δ=1} = MU_{δ=1} (shown in Lemma 6 in [11]), since by
Corollary 5.7 in holds LEAN_{δ=1} ∩ USAT = MU_{δ=1}.

For F ∈ VMU obviously there is some F′ ⊆ F with var(F′) = var(F) and
F′ ∈ MU; Lemma 5 in [11] asserts the converse, but this is false, as the following
simple example shows:


Example 4.4 Consider F := {⊥, {v}, {v̄}} and F′ := {{v}, {v̄}}; we have F′ ∈
MU and var(F′) = var(F), but F ∉ VMU, since {⊥} ∈ USAT. If we don’t want
to use the empty clause, then we can consider any F′ ∈ MU with v ∈ var(F′) and
{{v}, {v̄}} ∩ F′ = ∅, and let F := F′ ∪ {{v}, {v̄}}; again we have F′ ∈ MU and
var(F′) = var(F), but F ∉ VMU.
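
Lemma 4.1 gives a direct (exponential) membership test for VMU, which can be run on the clause-sets of Example 4.4. The following is a brute-force sketch; the integer encoding of literals and the concrete choice F′ = {{v, w}, {v̄, w}, {w̄}} for the second construction are assumptions made only for this illustration.

```python
from itertools import product

def is_sat(F):
    """Brute-force satisfiability test (exponential; tiny examples only)."""
    vs = sorted({abs(l) for C in F for l in C})
    for bits in product([False, True], repeat=len(vs)):
        val = dict(zip(vs, bits))
        if all(any(val[abs(l)] == (l > 0) for l in C) for C in F):
            return True
    return False

def is_vmu(F):
    """Lemma 4.1: F is unsatisfiable, and for every variable v the clauses
    of F not containing v form a satisfiable clause-set."""
    if is_sat(F):
        return False
    vs = {abs(l) for C in F for l in C}
    return all(
        is_sat(frozenset(C for C in F if v not in {abs(l) for l in C}))
        for v in vs
    )

def cls(*clauses):
    return frozenset(frozenset(c) for c in clauses)

v, w = 1, 2
F_with_bottom = cls((), {v}, {-v})        # first clause-set of Example 4.4
F_prime = cls({v, w}, {-v, w}, {-w})      # an illustrative F' in MU
F_second = F_prime | cls({v}, {-v})       # F' plus the two unit-clauses

print(is_vmu(F_with_bottom), is_vmu(cls({v}, {-v})),
      is_vmu(F_prime), is_vmu(F_second))
```

In both non-examples the test fails for the same reason: after removing the clauses containing some variable, an unsatisfiable rest remains.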


The corrected version of Lemma 5 from [11] is as follows:


Lemma 4.5 For F ∈ CLS let U_F be the set of F′ ⊆ F with var(F′) = var(F),
δ(F′) ≥ 1, and F′ ∈ USAT. Then F ∈ VMU if and only if F ∈ U_F and all
minimal elements of U_F w.r.t. the subset-relation are minimally unsatisfiable.


Proof: The condition is necessary, since if F ∈ VMU, then on the one hand we
have F ∈ LEAN \ {⊤}, and thus δ(F) ≥ 1 by bo (or use Lemma 3 in [11]); and
on the other hand if there would be a minimal element F′ ∈ U_F which wouldn’t
be minimally unsatisfiable, then there would be some F″ ⊂ F′ with F″ ∈ MU,
whence by definition of U_F we get var(F″) ⊂ var(F), contradicting F ∈ VMU.
For the other direction assume that we have U_F as specified, and we have to
show F ∈ VMU. Since F ∈ U_F, we have F ∈ USAT. Consider now some F′ ⊆ F
with F′ ∈ USAT, and assume var(F′) ⊂ var(F). Consider some minimal F″ ∈ U_F
(regarding inclusion) with F′ ⊆ F″ ⊆ F. Furthermore consider a minimal element
G ∈ U_F with G ⊆ F″; by assumption G ∈ MU, and since F′ ⊂ F″, we have
G ⊂ F″. If for C ∈ F″ we had var(F″ \ {C}) ⊂ var(F″), then there would be
x ∈ C such that x or x̄ is pure in F″, thus also pure in G, whence C ∉ G (since
G ∈ MU), contradicting var(G) = var(F″). Now choose some C ∈ F″ \ F′ (we
have var(F″ \ {C}) = var(F″)); by minimality of F″ we now have δ(F″ \ {C}) ≤ 0
(otherwise all conditions for U_F are fulfilled for F″ \ {C}), whence δ(F″) = 1.
But then due to var(F″) = var(G) and G ⊂ F″ it follows δ(G) ≤ 0, contradicting
G ∈ MU.


The following examples show applications of Lemma 4.5:


Example 4.6 Consider the two (non-)examples from Example 4.4:
1. For F = {⊥, {v}, {v̄}} we have the minimal element {⊥, {v}} of U_F which is
not minimally unsatisfiable.


2. For F = F′ ∪ {{v}, {v̄}} (note that from the assumptions follows var(F′) ⊃
{v}) consider a minimal F″ ⊆ F′ with var(G) = var(F) and δ(G) ≥ 1 for
G := F″ ∪ {{v}, {v̄}} (note ⊤ ⊂ F″ ⊂ F′): now G is a minimal element of
U_F which is not minimally unsatisfiable.


Based on Lemma 5 in [11], also the proof of Theorem 3 in [11] is false (the
procedure goes astray on the clause-sets of Example 4.4). Fortunately we can give
a simple proof of the assertion, which even shows fixed-parameter tractability (fpt)
of the decision problem “F ∈ VMU_{δ=k}?” in the parameter k:


Theorem 4.7 Membership “F ∈ VMU_{δ=k}?” for input F ∈ CLS is fpt in the
parameter k ∈ Z.


Proof: If F ∉ MLEAN, then F ∉ VMU (by Lemma 4.3). So we can assume now
F ∈ MLEAN, and thus we have δ(F′) < δ(F) for all F′ ⊂ F. Now the decisions
of Lemma 4.5, as discussed in Example 4.6, are fpt in k.
5 Eliminating and creating singularity


In this section we continue the study of the handling of singular variables in
minimally unsatisfiable clause-sets, as initiated in [64, 65]. In Subsection 5.1 we study the
reduction process, eliminating singular variables. A main insight is Lemma 5.4,
showing that the elimination is harmless concerning the minimum variable degree.
In Subsection 5.2 we introduce the inverse elimination (“extension”); the main point
here is the precise statement of the various conditions. Finally in Subsection 5.3 we
consider a special case of singularity, namely unit-clauses.


5.1 Singular DP-reduction 


In [65] (Section 3) the process of “singular DP-reduction” has been studied for
minimally unsatisfiable clause-sets. By it we can reduce the case of arbitrary
F ∈ MU to (non-singular) F′ ∈ MU′ (that is, for every v ∈ var(F′) we have
ld_{F′}(v), ld_{F′}(v̄) ≥ 2). The definition is as follows (see Definition 8 in [65]):


Definition 5.1 ([65]) The relation F →sDP F′ (singular DP-reduction) holds for
clause-sets F, F′ ∈ CLS, if there is a singular variable v in F, such that F′ is
obtained from F by DP-reduction on v, that is, F′ = DP_v(F). The reflexive-transitive
closure of this relation is denoted by F →sDP* F′.


By sDP(F) ⊆ MU for F ∈ MU the set of non-singular F′ ∈ MU with F →sDP* F′
is denoted. For us the main property of sDP(F) is that it is not empty. In [65] it is
shown that for F ∈ SMU we have |sDP(F)| = 1, and that for arbitrary F ∈ MU
and F′, F″ ∈ sDP(F) we have n(F′) = n(F″).


Example 5.2 In [65] the following is shown for F ∈ MU:
1. For δ(F) = 1 we have sDP(F) = {{⊥}}.
2. For δ(F) = 2 all elements of sDP(F) are isomorphic.
3. For δ(F) ≥ 3 in general there are non-isomorphic elements in sDP(F).


By the results of Sections 3.1, 3.2 in [65] we have the following basic preservation
properties:


Lemma 5.3 ([65]) For F, F′ ∈ MU with F →sDP F′ we have:


1. δ(F′) = δ(F).
2. F ∈ MU ⇔ F′ ∈ MU.

3. F ∈ SMU ⇒ F′ ∈ SMU.
4. F ∈ UHIT ⇒ F′ ∈ UHIT.


Although singular DP-reduction can reduce the variable-degree of some variables,
it can not decrease the minimum variable-degree:


Lemma 5.4 For F, F′ ∈ MU with F →sDP F′ we have μvd(F′) ≥ μvd(F).
Proof: It is sufficient to consider the case F′ = DP_v(F) for a singular variable v.
Assume μvd(F′) < μvd(F); thus var(F′) ≠ ∅ (otherwise we have μvd(F′) = +∞),
and we consider w ∈ var(F′) with vd_{F′}(w) = μvd(F′). So we have vd_{F′}(w) <
vd_F(w), and thus by Lemma 24 in [65], for all clauses C ∈ F with v ∈ var(C) we
have w ∈ var(C). But then μvd(F′) = vd_{F′}(w) ≥ vd_F(v) ≥ μvd(F) > μvd(F′), a
contradiction.


Thus in order to determine the minimum variable-degree for minimally unsatisfiable
clause-sets in dependency on the deficiency, w.l.o.g. one can restrict attention
to saturated and non-singular instances:


Corollary 5.5 For all k ∈ N holds:
1. μvd(MU_{δ=k}) = μvd(SMU′_{δ=k}).
2. μvd(UHIT_{δ=k}) = μvd(UHIT′_{δ=k}).


Proof: For Part 1 we note that by Lemma 5.4 for every F ∈ MU_{δ=k} we can find
F′ ∈ SMU′_{δ=k} with μvd(F′) ≥ μvd(F), and thus μvd(MU_{δ=k}) ≤ μvd(SMU′_{δ=k}),
while μvd(MU_{δ=k}) ≥ μvd(SMU′_{δ=k}) holds due to SMU′_{δ=k} ⊆ MU_{δ=k}. The same
reasoning applies for Part 2.


See Lemma for some conditions under which k ∈ N ↦ μvd(UHIT_{δ=k}) and
k ∈ N ↦ μvd(MU_{δ=k}) would be computable.


5.2 Singular DP-extensions


We consider now the reverse direction of singular DP-reduction, from DP_v(F) to F,
as a singular extension, and also generalise it to arbitrary clause-sets. This process
was mentioned in [65, Examples 15, 19, 54] for minimally unsatisfiable DP_v(F), called
“inverse singular DP-reduction” there:


Definition 5.6 Consider a clause-set G ∈ CLS, a variable v ∈ VA \ var(G), and
m ∈ N. A singular m-extension of G with v is a clause-set F ∈ CLS obtained
as follows (employing four choice steps):


1. m different clauses D_1, ..., D_m ∈ G are chosen.

2. A subset C ⊆ D_1 ∩ ... ∩ D_m is chosen.

3. A literal x with var(x) = v is chosen.

4. Clauses D′_i ∈ CL for i ∈ {1, ..., m} with (D_i \ C) ⊆ D′_i ⊆ D_i are chosen.

5. Let C′ := C ∪ {x}, and let D″_i := D′_i ∪ {x̄} for i ∈ {1, ..., m}.

6. F is obtained from G by adding C′ and replacing the D_i with D″_i:

F := (G \ {D_1, ..., D_m}) ∪ {C′, D″_1, ..., D″_m}.


Example 5.7 Consider G := {{a, b, c}, {a, b, c̄}}, m := 2, and the choices C :=
{a}, x := v, and D′_1 := {b, c}, D′_2 := {a, b, c̄}. Then the 2-extension F of G is
F = {{v, a}, {v̄, b, c}, {v̄, a, b, c̄}}.


By definition we have for an m-extension F of G ∈ CLS with v the following
simple properties:


1. c(F) = c(G) + 1, n(F) = n(G) + 1, δ(F) = δ(G).
2. v is singular for F, vd_F(v) = m + 1.
3. DP_v(F) = G.
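
The construction of a singular m-extension and the three properties just listed can be replayed in code. This is a minimal sketch, assuming literals are non-zero integers and clauses are frozensets; the functions `dp` and `singular_extension` and the sample input are hypothetical choices for illustration only.

```python
def dp(F, v):
    """DP_v(F): replace the clauses containing variable v by all their
    non-tautological resolvents on v."""
    pos = [C for C in F if v in C]
    neg = [C for C in F if -v in C]
    rest = {C for C in F if v not in C and -v not in C}
    resolvents = set()
    for C in pos:
        for D in neg:
            R = (C - {v}) | (D - {-v})
            if not any(-l in R for l in R):
                resolvents.add(R)
    return frozenset(rest | resolvents)

def singular_extension(G, Ds, C, x, D_primes):
    """Singular m-extension: Ds are the chosen clauses of G, C is a common
    subset, x is the new literal, D_primes are the reduced side clauses."""
    Ds = list(Ds)
    assert len(set(Ds)) == len(Ds) and all(D in G for D in Ds)
    assert all(C <= D for D in Ds)
    assert all(abs(x) != abs(l) for D in G for l in D)   # variable is new
    assert all((D - C) <= Dp <= D for D, Dp in zip(Ds, D_primes))
    new = {C | {x}} | {Dp | {-x} for Dp in D_primes}
    return frozenset((G - set(Ds)) | new)

fs = frozenset
a, b, c, v = 1, 2, 3, 4
G = fs({fs({a, b, c}), fs({a, b, -c})})          # illustrative input
F = singular_extension(
    G,
    Ds=[fs({a, b, c}), fs({a, b, -c})],
    C=fs({a}),
    x=v,
    D_primes=[fs({b, c}), fs({a, b, -c})],
)
print(sorted(sorted(cl) for cl in F))
print(len(F) == len(G) + 1, dp(F, v) == G)
```

The second side clause keeps the literal a although a ∈ C, which is allowed by step 4 of the definition.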
We now show that indeed the process of Definition 5.6 is precisely the inversion
of singular DP-reduction:


Lemma 5.8 Consider m ∈ N, G, F ∈ CLS and v ∈ VA. Then F is an m-extension
of G by v iff the following conditions are fulfilled:


1. v is singular for F;
2. vd_F(v) = m + 1;
3. DP_v(F) = G, c(G) = c(F) − 1.


Proof: If F is an m-extension of G by v, then the three properties hold, as we have
already mentioned. So assume these three properties hold. Now let the m clauses
D_1, ..., D_m be the result of singular DP-reduction on v for F; they must be pairwise
different, and all m resolutions must be possible, otherwise c(G) < c(F) − 1. And
let C be the clause containing the singular occurrence of v, minus the literal of v.
Now all properties of a singular m-extension are easily checked.


Singular extensions behave well regarding minimal unsatisfiability: 


Lemma 5.9 Consider m ∈ N, G ∈ CLS and an m-extension F of G by v ∈ VA.
Then F ∈ MU ⇔ G ∈ MU.


Proof: This follows by Lemma 5.8 together with Lemma 9, Parts 1, 2 in [65].


In the situation of Lemma 5.9 regarding saturatedness we only have the direction
F ∈ SMU ⇒ G ∈ SMU, while for the other direction the conditions of [65,
Lemma 12] need to be observed (this would yield “saturated extensions”, which
however we do not need here).


5.3 Unit clauses


We conclude this section by considering unit-clauses in minimally unsatisfiable
clause-sets. The following (fundamental, simple) lemma is proven in [65] (Lemma
14); there in Subsection 3.3 one finds further information on unit-clauses in
minimally unsatisfiable clause-sets.


Lemma 5.10 ([65]) Consider F ∈ MU.
1. If v is full and singular in F, then we have {v} ∈ F or {v̄} ∈ F.


2. If {x} ∈ F, then v := var(x) is singular in F (with ld_F(x) = 1). If here F is
saturated, then v is also full in F.


So unit-clauses in minimally unsatisfiable clause-sets are strong cases of singular
variables. They can obviously be removed by singular DP-reduction, while singular
m-extensions with m ≥ 2 can not remove all unit-clauses:


Lemma 5.11 Consider a clause-set F ∈ MU containing at least one unit-clause,
and obtain F′ from F by a singular m-extension, where m ≥ 2. Then also F′ must
contain at least one unit-clause.
Proof: For a unit-clause {x} ∈ F to be removed in F′, it needs to be one of the D_i
(using the terminology of Definition 5.6). Then the intersection C must be empty
(otherwise any other D_j needed to contain x, and since m ≥ 2 this would mean a
subsumption in F). Thus the extension introduces the new unit-clause C′.


The following examples show that the assumptions F ∈ MU and m ≥ 2 in
Lemma 5.11 are needed:


Example 5.12 First consider F := {{a}, {ā, b}, {ā, b̄}} ∈ MU. Via a 1-singular
extension we obtain F′ := {{v, a}, {v̄, a}, {ā, b}, {ā, b̄}}, which has no unit-clauses.
A 2-singular extension of F, which touches {a}, has C = ⊥, and thus C′ is a new
unit-clause. If on the other hand we consider F := {{a}, {a, b}} ∈ CLS \ MU, then
F′ := {{v, a}, {v̄, a}, {v̄, b}} is a 2-extension without unit-clauses.


For certain F ∈ MU_{δ=2} the existence of a unit-clause is actually necessary for
singularity:


Lemma 5.13 Consider F ∈ MU_{δ=2} with μvd(F) ≥ 4. Then F is singular if and
only if F contains a unit-clause.


Proof: That if F contains a unit-clause, then F must be singular, follows by
Lemma 5.10, Part 2. So assume now that F is singular, and we have to show that
F contains a unit-clause. Consider a reduction sequence F = F_0 →sDP F_1 →sDP
... →sDP F_m, where F_m is non-singular (note m ≥ 1). So there exists n ≥ 2 such
that F_m is isomorphic to ℱ_n (recall Example B.3), and thus every variable of F_m
has degree 4. So by Lemma 5.4 we know μvd(F_i) = 4 for i ∈ {0, ..., m}. We show
by induction on m that F contains a unit-clause. If m = 1, then in order to obtain
the min-var-degree of at least 4, at least 3 side-clauses D_1, ..., D_3 ∈ F_m for the
singular extension have to be chosen (using Definition 5.6), but every literal occurs
precisely twice in F_m (because of variable-degree 4 and non-singularity), and thus
the intersection C has to be empty, and the new clause introduced by the singular
extension is a unit-clause, whence F contains a unit-clause. Finally assume m > 1.
So by induction hypothesis, F_1 contains a unit-clause, and thus by Lemma 5.11 also
F_0 contains a unit-clause.


We will later see (Theorem B.9) that the condition μvd(F) ≥ 4 in Lemma 5.13
is equivalent to μvd(F) = 4; the following examples show that this condition can
not be improved:


Example 5.14 A 1-singular extension of A_2 is


fae CBee) eG Dee 9 Se fee Os De) Oe oe ae 


where F_1 is singular, has no unit-clause, and μvd(F_1) = 2. While a 2-singular
extension of A_2 is


Fo 1118 1 8 A a a = Se 


where F_2 is singular, has no unit-clause, and μvd(F_2) = 3.
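
The properties claimed in Example 5.14 can be verified for concrete extensions of A_2. The two clause-sets below are illustrative choices made for this sketch (new variable 3, common subset C = {1}); they satisfy the stated properties but need not coincide with the clause-sets displayed in the example.

```python
def dp(F, v):
    """DP_v(F): replace the clauses containing variable v by all their
    non-tautological resolvents on v."""
    pos = [C for C in F if v in C]
    neg = [C for C in F if -v in C]
    rest = {C for C in F if v not in C and -v not in C}
    res = {(C - {v}) | (D - {-v}) for C in pos for D in neg}
    res = {R for R in res if not any(-l in R for l in R)}
    return frozenset(rest | res)

def variables(F):
    return {abs(l) for C in F for l in C}

def var_degree(F, v):
    return sum(1 for C in F if v in C or -v in C)

def min_var_degree(F):
    return min(var_degree(F, v) for v in variables(F))

def is_singular(F):
    """Some variable occurs in one sign exactly once."""
    lit_deg = lambda x: sum(1 for C in F if x in C)
    return any(min(lit_deg(v), lit_deg(-v)) == 1 for v in variables(F))

def has_unit_clause(F):
    return any(len(C) == 1 for C in F)

def cls(*clauses):
    return frozenset(frozenset(c) for c in clauses)

A2 = cls({1, 2}, {-1, -2}, {-1, 2}, {-2, 1})
# 1-extension on D_1 = {1,2} and 2-extension on D_1 = {1,2}, D_2 = {1,-2}.
F1 = cls({1, 3}, {2, -3}, {-1, -2}, {-1, 2}, {1, -2})
F2 = cls({1, 3}, {2, -3}, {-2, -3}, {-1, -2}, {-1, 2})

for F in (F1, F2):
    print(dp(F, 3) == A2, is_singular(F), has_unit_clause(F), min_var_degree(F))
```

In both cases a non-empty C is possible because the chosen side clauses share the literal 1, which is why no unit-clause arises.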


We conclude with a simple form of adding a new variable, by adding it in one 
sign as unit-clause, and adding it in the other sign to all given clauses: 


Definition 5.15 A full singular unit-extension of a clause-set F ∈ CLS (by
unit-clause {x}) is a clause-set F′ ∈ CLS obtained from F by adding a unit-clause
{x} with var(x) ∉ var(F), and by adding literal x̄ to all clauses of F, i.e., F′ :=
{{x}} ∪ {C ∪ {x̄} : C ∈ F} for some x ∈ LIT \ lit(F).
A full singular unit-extension F′ of F by {x} is a case of a singular c(F)-extension
of F with var(x), and thus F′ →sDP F.


Example 5.16 Starting from {⊥}, the (up to the choice of the new literal) first full
singular unit-extension is {{v}, {v̄}}, the second one is {{w}, {v, w̄}, {v̄, w̄}}. In this
way we get special examples of SMU_{δ=1} (since we started with {⊥} ∈ SMU_{δ=1}).

If we start with ⊤ instead, then first we get {{v}}, and then {{w}, {v, w̄}}.
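
Definition 5.15 translates directly into a one-line construction. The sketch below assumes literals are non-zero integers (v = 1, w = 2) and uses a hypothetical function name; it reproduces the two sequences of Example 5.16.

```python
def full_singular_unit_extension(F, x):
    """Add the unit-clause {x} for a new variable and add the complement
    of x to every clause of F."""
    assert all(abs(l) != abs(x) for C in F for l in C)   # var(x) is new
    return frozenset({frozenset({x})} | {C | {-x} for C in F})

def show(F):
    return sorted(sorted(C) for C in F)

v, w = 1, 2
bottom = frozenset({frozenset()})   # the clause-set containing only the empty clause
top = frozenset()                   # the empty clause-set

E1 = full_singular_unit_extension(bottom, v)
E2 = full_singular_unit_extension(E1, w)
T1 = full_singular_unit_extension(top, v)
T2 = full_singular_unit_extension(T1, w)

for F in (E1, E2, T1, T2):
    print(show(F))
```

Each step adds exactly one clause and one variable, so the deficiency stays unchanged along both sequences.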

Example 15, Part 1, in [64] contains two examples of “inverse unit elimination”,
where Example (a) there is an example of a full singular unit-extension, while
Example (b) there would be a non-full singular unit-extension (where the new variable
is not full; not used in this report). There is the dual notion of a “full variable”
for a clause-set F, which is some element of ⋂_{C ∈ F} var(C), which explains why we
speak of a “full extension” (namely the new variable is full).


The process of full singular unit-extension of a clause-set F maintains many
properties of F, and we list here those we use:


Lemma 5.17 Consider a full singular unit-extension F′ of F (by {v}):
1. n(F′) = n(F) + 1 and c(F′) = c(F) + 1.

2. δ(F′) = δ(F).

3. σ(F′) = σ(F) for F ≠ {⊥}.

4. μvd(F′) = μvd(F) for n(F) > 0.

5. F′ is satisfiable iff F is satisfiable.

6. For F ≠ ⊤: F′ is lean iff F is lean.

7. F′ is (saturated) minimally unsatisfiable iff F is (saturated) minimally
unsatisfiable.

8. F′ is hitting iff F is hitting.


Proof: Parts 1, 2 follow directly by definition. For Part 3 we notice that for F = ⊤
we have σ(F′) = σ(F) = 0, while for n(F) > 0 consider ∅ ⊂ V ⊆ var(F′): if v ∉ V,
then F′[V] = F[V], and thus the minimisation for σ(F) is included in σ(F′), and
if v ∈ V, then δ(F′[V]) = c(F′) − |V| ≥ δ(F′) = δ(F) ≥ σ(F), and thus these V do
not contribute to the minimisation.

For Part 4 we just note that the variables of F keep their degrees in F′, while the
new variable has degree vd_{F′}(v) = c(F′) > c(F), and thus does not contribute to
the min-var degree. Part 5 is trivial, and follows also by the satisfiability-equivalence
of F′ and DP_v(F′) = F. For Part 6 we note that an autarky for F′ involving v must
be a satisfying assignment for F′, while the autarkies for F′ not involving v are
the same as the autarkies for F. Part 7 concerning (just) minimal unsatisfiability
follows with Lemma 9 in [65], while regarding saturatedness we can use Lemma 12
in [65] (both assertions also follow easily by direct reasoning). Part 8 is trivial.


So our fundamental classes are respected by full singular unit-extension: 


Corollary 5.18 If F ∈ MU_{δ=k} (k ∈ N), then every full singular unit-extension is
also in MU_{δ=k}. If furthermore F is saturated resp. hitting, then every full singular
unit-extension is also saturated resp. hitting.


12) The case m = 0 is excluded in Definition 5.6, since it is not needed, and would only complicate
the formulation.
Obviously, full singular unit-extension is unique up to isomorphism:


Lemma 5.19 Consider a clause-set F ∈ CLS and clause-sets F′, F″ ∈ CLS obtained
from F by repeated full singular unit-extensions. Then F′, F″ are isomorphic
if and only if n(F′) = n(F″).


Proof: The number of repeated full singular unit-extensions leading to F′ resp.
F″ is the number of variables in these clause-sets with degree strictly greater than
c(F), and sorting these variables by increasing degree yields the sequence of
extensions. Thus just from knowing the number of variables in F′, F″ we can reconstruct
them up to isomorphism (using that a full singular unit-extension of F by {x} is
isomorphic to one by {y}, for arbitrary literals x, y with new variables).


6 Full subsumption resolution / extension 


In this section we investigate the second reduction concept for this report, “full
subsumption resolution”. As with singular DP-reduction from Section 5, in general the
reduction uncovers hidden structure, while the inverse process, “full subsumption
extension”, serves as a generator for minimally unsatisfiable clause-sets with various
properties. However in this report, unlike with singular DP-reduction, we will not
consider full subsumption resolution for arbitrary F ∈ MU, but only starting from
some A(V), while a deeper use will be important in [67]. Subsection 6.1 discusses
the basic definitions (there are various technicalities one needs to be aware of), and
first applications are given in Subsection

The basic idea is, for a clause-set F containing two clauses R ∪ {v}, R ∪ {v̄} ∈ F,
to replace these two clauses by the clause R, i.e., we consider the case where the
resolvent R of parent clauses C, D subsumes both parent clauses (thus the name).
This is a very old procedure, based on the trivial observation that (R ∨ v) ∧ (R ∨ ¬v)
is logically equivalent to R. If we perform this in the inverse direction, as an
“extension”, then every clause-set F ∈ CLS can be transformed into its “distinguished
CNF” F′ ⊆ A(var(F)) (just expand every non-full clause), which is uniquely
determined. We however have to be more careful about deficiency and membership in
MU, and thus will consider only “full subsumption resolution”, where the resolvent
must not be present already, while for the “strict” form additionally the resolution
variable v must occur also in other clauses. Then from A(V) by strict full
subsumption resolution we can obtain precisely the F ∈ UHIT with var(F) = V (Lemma
6.9). For the inverse forms, we have to be even more careful, making sure that
neither of the two parent clauses is already present (this prevents the above
expansion of arbitrary F ∈ CLS to A(var(F))).

The (more general) well-known “subsumption resolution” is the reduction F ↝
(F \ {C}) ∪ {C \ {x}} for F ∈ CLS, that is the removal of a literal x ∈ C from a
clause C ∈ F, in case there exists D ∈ F with x̄ ∈ D and D \ {x̄} ⊆ C (note that
C ∘ D = C \ {x} subsumes C). An early use is in [85] (under the name “replacement
principle”), while the terminology “subsumption resolution” is used in [29] (for
SAT solving). The earliest sources with a systematic treatment appear to be bo, 
Section 7] and Section 7]. An experimental study of the practical importance 
of subsumption resolution in connection with DP-reductions F ↝ DP_v(F) (under
suitable additional conditions to make DP-reduction feasible; see Subsection 1.3 in 
(Gel for an overview on such restrictions) is performed in [Bo (under the name of 
“self-subsuming resolution”), continued in BA. A theoretic (similar) use one finds 
in [79, Section 4], where a variable v is called “DP-simplicial” for F ∈ CLS iff all
resolutions performed by the reduction F ↝ DP_v(F) are subsumption resolutions.
Our special form, where both parent clauses are subsumed, we call full
subsumption resolution, namely the reduction F ↝ (F \ {C, D}) ∪ {C ∘ D} in case of
C, D ∈ F such that C ∩ D̄ = {x} and C \ {x} = D \ {x̄}. A main tool is Lemma
6.5, where especially Part 6 is somewhat subtle, and can be easily overlooked. Via
this tool we have a controlled way of transforming F ∈ MU resp. F ∈ UHIT
into A(var(F)), and in Theorem this yields the determination of the possible
numbers of variables and clauses in minimally unsatisfiable clause-sets of a given
deficiency.


6.1 Basic definitions 


Before defining “full subsumption resolution” F ↝ (F \ {R ∪ {v}, R ∪ {v̄}}) ∪ {R}
in Definition 6.7 (so R is new and the two clauses R ∪ {v}, R ∪ {v̄} vanish), we
introduce the “strict” form, which is more important to us, and which has the
additional condition that v must still occur (in other clauses of F); the “non-strict”
form on the other hand guarantees that v vanishes (see Definition 6.7):


Definition 6.1 For clause-sets $F, F' \in \mathcal{CLS}$ by $F \xrightarrow{\mathrm{sfsR}} F'$ we denote that $F'$ is
obtained from $F$ by one step of strict full subsumption resolution, that is,

• there is a clause $R \in F'$

• and a literal $x$ with $\mathrm{var}(x) \notin \mathrm{var}(R)$

• such that for the clauses $C := R \cup \{x\}$ and $D := R \cup \{\overline{x}\}$

• we have $F = (F' \setminus \{R\}) \cup \{C, D\}$;

• we furthermore require $\mathrm{var}(x) \in \mathrm{var}(F')$.

As usual, the literals $x, \overline{x}$ are the resolution literals, $\mathrm{var}(x)$ is the resolution
variable, $C, D$ are the parent clauses, and $R$ is the resolvent.
We write $F \xrightarrow{\mathrm{sfsR}}_k F'$ for $k \in \mathbb{N}_0$ if exactly $k$ steps have been performed, while we
write $F \xrightarrow{\mathrm{sfsR}}_* F'$ for an arbitrary number of steps (including zero).


We require $R \notin F$, that is, the (full subsumption) resolvent is not already present
in the original clause-set. This is of course satisfied if $F \in \mathcal{MU}$. We also require
that the variable $v$ does not vanish, for the sake of keeping control on the deficiency.


Example 6.2 Some simple examples are: 


1. $F := \{\{1,2\}, \{-1,2\}, \{-1,-2\}, \{-2,1\}\} \xrightarrow{\mathrm{sfsR}} \{\{2\}, \{-1,-2\}, \{-2,1\}\}$, and
no further reduction is possible (note that the only possibility is blocked, since
variable 1 would vanish).


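2. $\{\{v\}, \{\overline{v}\}\} \not\xrightarrow{\mathrm{sfsR}} \{\bot\}$, as $v$ vanishes, while $\{\{v\}, \{\overline{v}\}, \{v, x\}\} \xrightarrow{\mathrm{sfsR}} \{\bot, \{v, x\}\}$.
3. $\{\{v, w\}, \{\overline{v}, w\}, \{v, \overline{w}\}\} \not\xrightarrow{\mathrm{sfsR}} \{\{v, w\}, \{w\}, \{v, \overline{w}\}\}$, as one parent clause is
kept, while $\{\{v, w\}, \{\overline{v}, w\}, \{v, \overline{w}\}\} \xrightarrow{\mathrm{sfsR}} \{\{w\}, \{v, \overline{w}\}\}$.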


4. $\{\{v, w\}, \{\overline{v}, w\}, \{v, \overline{w}\}, \{v\}, \{w\}\}$ can not be reduced by strict full subsumption
resolution, since all possible resolvents are already there.


The expansion of a clause $R$ to two clauses $R \cup \{v\}$, $R \cup \{\overline{v}\}$ under the above
requirements is called “extension”:

Definition 6.3 For clause-sets $F, F' \in \mathcal{CLS}$ we say that $F$ is obtained from $F'$
by strict full subsumption extension if $F \xrightarrow{\mathrm{sfsR}}_* F'$, and for $k \in \mathbb{N}_0$ we say that
$F$ is obtained from $F'$ by strict full subsumption extension with $k$ steps
if $F \xrightarrow{\mathrm{sfsR}}_k F'$.

So one step of strict full subsumption extension for a clause-set $F'$ uses a non-full
clause $R \in F'$ and a variable $v \in \mathrm{var}(F') \setminus \mathrm{var}(R)$, and replaces $R$ by the two clauses
$R \cup \{v\}$, $R \cup \{\overline{v}\}$, where none of them is already present.


Example 6.4 From $\{\{a\}, \{b\}\}$ by one step of strict full subsumption extension we
can obtain $\{\{a, b\}, \{a, \overline{b}\}, \{b\}\}$ or $\{\{a\}, \{a, b\}, \{\overline{a}, b\}\}$; note that no new variable has
been introduced, that the original clause ($\{a\}$ resp. $\{b\}$) vanished, and that its
replacement clauses were not already present. For $\{\{a, b\}, \{a\}\}$ no strict full
subsumption extension is possible. Further examples are obtained by “reading Example
6.2 backwards”.


The basic properties of strict full subsumption resolution are collected in the 
following lemma. 


Lemma 6.5 For clause-sets $F, F' \in \mathcal{CLS}$ with $F \xrightarrow{\mathrm{sfsR}}_k F'$ ($k \in \mathbb{N}_0$, with resolution
variable $v$ and resolvent $R$) we have:
1. $F'$ is logically equivalent to $F$.
2. $\mathrm{var}(F') = \mathrm{var}(F)$.
3. $c(F') = c(F) - k$, $\delta(F') = \delta(F) - k$.
4. $\mu\mathrm{vd}(F) \ge \mu\mathrm{vd}(F')$.
5. $F \in \mathcal{MU} \Rightarrow F' \in \mathcal{MU}$.
6. If $k = 1$ and $F' \in \mathcal{MU}$, then exactly one of the following three possibilities
holds:
(a) $S(F', R, v)$ is a partial saturation of $F'$ (recall Definition 3.4).
(b) $S(F', R, \overline{v})$ is a partial saturation of $F'$.
(c) $F \in \mathcal{MU}$.
7. $F \in \mathcal{SMU} \Rightarrow F' \in \mathcal{SMU}$.
8. $F \in \mathcal{HIT} \Leftrightarrow F' \in \mathcal{HIT}$.


Proof: Parts 1, 2, 3, 4 follow directly from the definition. Parts 5, 7 hold since we
strengthen two clauses into one, which on the other hand is logically equivalent to
its parent clauses. Part 8 follows by trivial combinatorics.

Now consider Part 6. That the two possibilities for partial saturation exclude
each other follows by Lemma 3.8 (and $F' \setminus \{R\} \not\models R$). And that each possibility
for partial saturation excludes $F \in \mathcal{MU}$ follows by definition. Finally, that the
negation of the two partial saturation possibilities implies $F \in \mathcal{MU}$ follows again
by Lemma 3.8.


Part 6 of Lemma 6.5 handles a subtle source for errors: One could easily think
that for $F' \in \mathcal{MU}$ a strict full subsumption extension yields another $F \in \mathcal{MU}$, but
this is not so, as there are three possible cases to be considered here, illustrated by
the following example:

Example 6.6 Consider $F := \{\{v, a\}, \{\overline{v}, a\}, \{\overline{v}\}, \{v, \overline{a}\}\}$. So $F \xrightarrow{\mathrm{sfsR}}_1 F'$ for $F' :=
\{\{a\}, \{\overline{v}\}, \{v, \overline{a}\}\}$. We have $F' \in \mathcal{MU}$, but $F \notin \mathcal{MU}$ (the clause $\{\overline{v}, a\}$ is subsumed
by $\{\overline{v}\}$), and indeed $S(F', R, v) = \{\{a, v\}, \{\overline{v}\}, \{v, \overline{a}\}\}$ is a partial saturation of $F'$
(while $S(F', R, \overline{v})$ isn't one).


The condition on the resolution variable for strict full subsumption resolution
(that it must not vanish) is exactly needed for Parts 2, 3 of Lemma 6.5. If this
condition is dropped, then we speak of full subsumption resolution:


Definition 6.7 Full subsumption resolution is defined as strict full subsumption
resolution, but now the resolution variable is allowed to vanish. If the resolution
variable definitely vanishes, then we speak of non-strict full subsumption
resolution. In the other direction we speak of full subsumption extension resp.
non-strict full subsumption extension.


So if $F'$ is obtained from $F$ by one step of non-strict full subsumption extension,
then we have $c(F') = c(F) + 1$, $n(F') = n(F) + 1$ and $\delta(F') = \delta(F)$.


Example 6.8 Considering the non-examples from Example 6.2:


1. $\{\{v\}, \{\overline{v}\}\} \not\xrightarrow{\mathrm{sfsR}} \{\bot\}$, but by full subsumption resolution we obtain $\{\bot\}$.


2. $\{\{v, w\}, \{\overline{v}, w\}, \{v, \overline{w}\}\} \not\xrightarrow{\mathrm{sfsR}} \{\{v, w\}, \{w\}, \{v, \overline{w}\}\}$, and the transition is
also not possible by full subsumption resolution.

3. $\{\{v, w\}, \{\overline{v}, w\}, \{v, \overline{w}\}, \{v\}, \{w\}\}$ is irreducible by full subsumption resolution.
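
The examples above can be replayed mechanically. The following is a minimal sketch, not taken from the source: it assumes clauses are frozensets of non-zero integers (DIMACS-style literals, with -v the complement of v) and that no clause is tautological; the helper names `variables` and `sfs_resolution_steps` are hypothetical. With `strict=True` the resolution variable must still occur afterwards (Definition 6.1), with `strict=False` it may vanish (Definition 6.7).

```python
from itertools import combinations


def variables(F):
    """Set of variables occurring in the clause-set F."""
    return {abs(lit) for clause in F for lit in clause}


def sfs_resolution_steps(F, strict=True):
    """Return all clause-sets reachable from F by one full subsumption
    resolution step; with strict=True the resolution variable must survive."""
    F = frozenset(frozenset(clause) for clause in F)
    results = set()
    for C, D in combinations(F, 2):
        diff = C ^ D                      # literals in exactly one parent
        if len(diff) != 2:
            continue
        x, y = diff
        if x != -y:                       # parents must differ in x / -x only
            continue
        R = C & D                         # the resolvent
        if R in F:                        # resolvent must be new
            continue
        G = (F - {C, D}) | {R}
        if strict and abs(x) not in variables(G):
            continue                      # resolution variable would vanish
        results.add(G)
    return results
```

For instance, applied to the full clause-set on the variables 1, 2, the sketch finds four possible strict steps, and after resolving on variable 1 the only remaining candidate step is blocked in the strict form, as in Example 6.2.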


As follows from the characterisation of $\mathcal{SMU}_{\delta=1} = \mathcal{UHIT}_{\delta=1}$ in KY, a clause-set
$F \in \mathcal{CLS}$ can be reduced by a series of non-strict full subsumption resolutions to
$\{\bot\}$ iff $F \in \mathcal{SMU}_{\delta=1} = \mathcal{UHIT}_{\delta=1}$.


6.2 Extensions to full clause-sets 


If we start with the full clause-sets A(V), then by strict full subsumption resolution 
we obtain exactly all unsatisfiable hitting clause-sets: 


Lemma 6.9 If for some finite $V \subset \mathcal{VA}$ we have $A(V) \xrightarrow{\mathrm{sfsR}}_* F$, then $F \in \mathcal{UHIT}$
holds. And for $F \in \mathcal{UHIT}$ we have $A(\mathrm{var}(F)) \xrightarrow{\mathrm{sfsR}}_* F$.

Proof: The first part follows by Lemma 6.5, Part 8 (and $A(V) \in \mathcal{UHIT}$). And
for the second part note that if $F \in \mathcal{UHIT}$ has a non-full clause, then a strict
full subsumption extension step can be applied, where the result is still in $\mathcal{UHIT}$
(again by Lemma 6.5, Part 8; if $F$ has only full clauses, then $F = A(\mathrm{var}(F))$).


Recall that in Example 6.6 we have seen that strict full subsumption extension
does not maintain minimal unsatisfiability in general. But from arbitrary minimally
unsatisfiable $F$ we can obtain $A(\mathrm{var}(F))$, when we additionally allow partial
saturation:


Lemma 6.10 For $F \in \mathcal{MU}$ we can obtain $A(\mathrm{var}(F))$ from $F$ by a series of strict
full subsumption extensions in combination with partial saturations.


Proof: If $F \in \mathcal{MU}$ has a non-full clause, and if strict full subsumption extension
can not be applied in order to obtain $F' \in \mathcal{MU}$, then by Lemma 6.5, Part 6, a
partial saturation is possible.


We obtain sharp upper bounds on deficiency and number of clauses in terms of
the number of variables:

Corollary 6.11 For $F \in \mathcal{MU}$ holds:
1. $\delta(F) \le 2^{n(F)} - n(F)$.
2. $c(F) \le 2^{n(F)}$.
In both cases we have equality iff $F$ is full (i.e., $F = A(\mathrm{var}(F))$).


Proof: For Part 1 note that by Lemma 6.10 we can transform $F$ into $A(\mathrm{var}(F))$ by
a series of steps not decreasing the deficiency. Thus $\delta(F) \le \delta(A(\mathrm{var}(F))) = 2^{n(F)} - n(F)$.
For Part 2 note $c(F) = \delta(F) + n(F) \le 2^{n(F)}$ (by Part 1). For $F = A(\mathrm{var}(F))$
these inequalities are indeed equalities. If we had $\delta(F) = 2^{n(F)} - n(F)$ for some
non-full $F$, then some strict full subsumption extension must be possible, contradicting
the upper bound of Part 1. And if we have $c(F) = 2^{n(F)}$ for some non-full $F$, then
again some strict full subsumption extension must be possible, contradicting the
upper bound of Part 2.


We explicitly state the instructive reformulation, that the $A_n$ are the minimally
unsatisfiable clause-sets of maximal deficiency for given number $n$ of variables:


Corollary 6.12 Consider $m \in \mathbb{N}_0$ and $F \in \mathcal{MU}_{n=m}$ such that $\delta(F)$ is maximal.^{13)}
Then $F = A(\mathrm{var}(F))$. Thus the maximal deficiency for $F \in \mathcal{MU}_{n=m}$ is $2^m - m$
(realised by $A_m \in \mathcal{MU}_{n=m} \cap \mathcal{MU}_{\delta=2^m-m}$).


So for $m = 0, 1, 2, 3, 4, 5, 6$ variables the maximal deficiency of minimally
unsatisfiable clause-sets is $1, 1, 2, 5, 12, 27, 58$; in general the deficiencies of the form $2^n - n$
are central for our investigations (note that the function $m \in \mathbb{N}_0 \mapsto 2^m - m \in \mathbb{N}$ is
monotonically increasing). We are now able to determine the numbers of variables
and numbers of clauses possible for minimally unsatisfiable clause-sets with a given
deficiency:


Theorem 6.13 For $k \in \mathbb{N}$ let $o(k) \in \mathbb{N}_0$ be the smallest $n \in \mathbb{N}_0$ with $2^n - n \ge k$.
1. $\{n(F) : F \in \mathcal{MU}_{\delta=k}\} = \{n \in \mathbb{N}_0 : n \ge o(k)\}$.
2. $\{c(F) : F \in \mathcal{MU}_{\delta=k}\} = \{n \in \mathbb{N} : n \ge o(k) + k\}$.


Proof: Part 2 follows by Part 1, so it remains to show Part 1. By Corollary 6.11
we see that the left-hand side is a subset of the right-hand side. To show the
other direction, we first note that increasing the number of variables by keeping the
deficiency constant is achieved by one non-strict full subsumption extension step.
It remains to show the existence of $F \in \mathcal{MU}_{\delta=k}$ with $n(F) = o(k)$. For $k = 1$
we have $F = \{\bot\}$, so assume $k > 1$. Let $F_0 := A(o(k) - 1)$ (so $\delta(F_0) = k - 1$;
note $o(k) - 1 \ge 1$). Add a variable by one step of non-strict full subsumption
extension, obtaining $F_1 \in \mathcal{MU}_{\delta=k-1}$ with one new variable, and then take a clause
in $F_1$ without that new variable and perform one step of strict full subsumption
extension (on that new variable), obtaining $F_2$ with $n(F_2) = n(F_1) = o(k)$ and
$\delta(F_2) = \delta(F_1) + 1 = k$.


$o(k)$ for $k \ge 1$ by definition is the smallest $n \ge 0$ with $\delta(A_n) \ge k$, and by
Theorem 6.13 it is the smallest $n \ge 0$ such that there is $F \in \mathcal{MU}_{\delta=k}$ with $n(F) = n$.
We have $o(1) = 0$, $o(2) = 2$, $o(3) = \dots = o(5) = 3$, $o(6) = \dots = o(12) = 4$
and $o(13) = \dots = o(27) = 5$. Except for the first term, the sequence $(o(k))_{k \in \mathbb{N}}$
is sequence http://oeis.org/A103586 in the “On-Line Encyclopedia of Integer
Sequences”.
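
The listed values of $o(k)$ follow directly from the defining condition of Theorem 6.13. A minimal sketch (the function names `min_vars` and `min_clauses` are hypothetical, chosen only for this illustration):

```python
def min_vars(k):
    """Smallest n >= 0 with 2**n - n >= k, i.e. o(k) from Theorem 6.13."""
    n = 0
    while 2**n - n < k:
        n += 1
    return n


def min_clauses(k):
    """Smallest possible number of clauses for deficiency k (Theorem 6.13, Part 2)."""
    return min_vars(k) + k
```

The loop terminates for every k >= 1, since $2^n - n$ is unbounded in n.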


13) That is, $F \in \mathcal{MU}$, $n(F) = m$, and for all $F' \in \mathcal{MU}$ with $n(F') = m$ we have $\delta(F') \le \delta(F)$.

7 Non-Mersenne numbers


In this section we study the function $\mathrm{nM} : \mathbb{N} \to \mathbb{N}$ via a recursive definition
(Definition 7.1; see Table 1). The understanding of this recursion is the underlying topic
of this section. This recursion is naturally obtained from splitting on variables
with minimum occurrence in minimally unsatisfiable clause-sets, and will be used
in Theorem 8.3 later. The sequence nM is sequence http://oeis.org/A062289 in
the “On-Line Encyclopedia of Integer Sequences”:


• It can be defined as the enumeration of those natural numbers containing “10”
in their binary representation; in other words, exactly the numbers whose
binary representation contains only 1's are skipped.


• Thus the sequence leaves out exactly the numbers of the form $2^n - 1$ for $n \in \mathbb{N}$
(that is, 1, 3, 7, 15, 31, ...), whence the name.


• The sequence consists of arithmetic progressions of slope 1 and length $2^m - 1$,
$m = 1, 2, \dots$, each such progression separated by an additional step of +1.

k      | 1 | 2  3  4 | 5  ···  11 | 12  ···  26 | 27  ···  57 | 58
nM(k)  | 2 | 4  5  6 | 8  ···  14 | 16  ···  30 | 32  ···  62 | 64


Table 1: Values for $\mathrm{nM}(k)$, $k \in \{1, \dots, 58\}$


The key deficiencies in Table 1 are the following two classes:


1. The k-values $k = 1, 2, 5, 12, 27, 58, \dots$ are the deficiencies $k = 2^n - n$ of the
clause-sets $A_n$, $n \in \mathbb{N}$, while the corresponding values $\mathrm{nM}(k) = 2^n$ are the
minimum variable-degree of the clause-sets $A_n$ (see Lemma 2.13), as explained
in Subsection 1.3.


2. The k-values $1, 4, 11, 26, 57, \dots$ are the positions just before these deficiencies,
as also discussed in Subsection 1.3; we call them “jump positions”, since
precisely at these positions the function value increases by 2 for the next
argument (compare Definition 7.12).


The recursion in Definition 7.1 is new, and so we can not use these characterisations,
but must directly prove the basic properties; the deficiencies $k = 2^n - n$
will be handled in Corollary 7.23, while the jump positions are handled in Lemma
7.20. Later we will obtain two further alternative characterisations of nM:


• Combinatorial characterisations are obtained in a later corollary, where we will
see that $\mathrm{nM}(k)$ for $k \in \mathbb{N}$ is the maximal min-var-degree for lean clause-sets
or variable-minimal unsatisfiable clause-sets with deficiency $k$.


• In a later subsection we will develop a general recursion scheme, which has the
function nM “built-in”, as shown in Theorem 13.15.


Definition 7.1 For $k \in \mathbb{N}$ let $\mathrm{nM}(k) := 2$ if $k = 1$, while else

$$\mathrm{nM}(k) := \max_{i \in \{2, \dots, k\}} \min(2 \cdot i, \ \mathrm{nM}(k - i + 1) + i).$$

The intuition underlying Definition 7.1 of $\mathrm{nM}(k)$, as later unfolded in Theorem 8.3,
is that we want to get an upper bound on the min-var-degree of an $F \in \mathcal{MU}_{\delta=k}$
(recall Definition 2.11), and for that we consider a variable $v \in \mathrm{var}(F)$ of minimum
var-degree, set it to 0, 1, and infer an upper bound on $\mathrm{vd}_F(v)$ from the two splitting
results. The index $i$ runs over the possible literal-degrees of $v$ (thus we have to
maximise over it), where $i$ actually is the maximum degree over both signs, and
thus we can take the minimum with $i + i$ for the var-degree. In the splitting results
$\langle v \to \varepsilon \rangle * F$ ($\varepsilon \in \{0, 1\}$) the deficiency is reduced by $i - 1$, since $i$ occurrences
(i.e., clauses) and one variable are lost, and we apply recursively the bound
$\mathrm{nM}(k - (i - 1))$, where then the $i$ cancelled occurrences have to be re-added.


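Example 7.2 Computing $\mathrm{nM}(k)$ for $2 \le k \le 5$:
1. $\mathrm{nM}(2) = \min(2 \cdot 2, \mathrm{nM}(2-2+1) + 2) = \min(4, 4) = 4$.


2. $\mathrm{nM}(3) = \max(\min(2 \cdot 2, \mathrm{nM}(3-2+1) + 2), \min(2 \cdot 3, \mathrm{nM}(3-3+1) + 3)) =
\max(\min(4, 6), \min(6, 5)) = 5$.


3. $\mathrm{nM}(4) = \max(\min(2 \cdot 2, \mathrm{nM}(4-2+1) + 2), \min(2 \cdot 3, \mathrm{nM}(4-3+1) + 3),
\min(2 \cdot 4, \mathrm{nM}(4-4+1) + 4)) = \max(\min(4, 7), \min(6, 7), \min(8, 6)) = 6$.


4. $\mathrm{nM}(5) = \max(\min(2 \cdot 2, \mathrm{nM}(5-2+1) + 2), \min(2 \cdot 3, \mathrm{nM}(5-3+1) + 3),
\min(2 \cdot 4, \mathrm{nM}(5-4+1) + 4), \min(2 \cdot 5, \mathrm{nM}(5-5+1) + 5)) = \max(\min(4, 8), \min(6, 8),
\min(8, 8), \min(10, 7)) = 8$.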
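
The hand computations of Example 7.2 can be reproduced by a direct transcription of Definition 7.1. This is only an illustrative sketch with memoisation; the name `nM` mirrors the text, everything else is an assumption of the sketch.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def nM(k):
    """Non-Mersenne function, computed by the recursion of Definition 7.1."""
    if k == 1:
        return 2
    return max(min(2 * i, nM(k - i + 1) + i) for i in range(2, k + 1))
```

The recursion depth grows linearly with k, so this form is only meant for small arguments such as those of Table 1.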

7.1 Basic properties 


We begin our investigations into nM(k) by some simple bounds: 


Lemma 7.3 Consider $k \in \mathbb{N}$.
1. $k + 1 \le \mathrm{nM}(k) \le 2 \cdot k$.
2. For $k \ge 2$ we have $\mathrm{nM}(k) \ge 4$.


Proof: The upper bound of Part 1 follows directly from the definition (by the
min-component $2i$). The lower bound follows by induction: $\mathrm{nM}(1) = 2 \ge 1 + 1$, while
for $k > 1$ we have $\mathrm{nM}(k) \ge \min(2k, \mathrm{nM}(k - k + 1) + k) = \min(2k, 2 + k) = k + 2$.
Part 2 follows by Part 1 and $\mathrm{nM}(2) = 4$.


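A basic tool for investigating sequences is the Delta-operator, which measures
the differences in values between two neighbouring arguments:


Definition 7.4 For a sequence $a : \mathbb{N} \to \mathbb{R}$ and $k \in \mathbb{N}$ let $\Delta a(k) := a(k+1) - a(k)$
be the step in the value of the sequence from $k$ to $k + 1$.


A few obvious properties of this Delta-operator are as follows:
1. $\Delta : \mathbb{R}^{\mathbb{N}} \to \mathbb{R}^{\mathbb{N}}$ is linear: $\Delta(\lambda \cdot a + \mu \cdot b) = \lambda \cdot \Delta(a) + \mu \cdot \Delta(b)$.
2. $a \in \mathbb{R}^{\mathbb{N}}$ is constant iff $\Delta a = (0)$.


3. $a$ is increasing iff $\Delta a \ge 0$, while $a$ is strictly increasing iff $\Delta a > 0$. Here for
sequences $a, b : \mathbb{N} \to \mathbb{R}$ of real numbers we use $a \le b :\Leftrightarrow \forall n \in \mathbb{N} : a_n \le b_n$,
and $a < b :\Leftrightarrow \forall n \in \mathbb{N} : a_n < b_n$.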


The first key insight is, that the next number in the sequence of non-Mersenne 
numbers is obtained by adding 1 or 2 to the previous number: 


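Lemma 7.5 For $k \in \mathbb{N}$ holds $\Delta \mathrm{nM}(k) \in \{1, 2\}$.

Proof: For $k = 1$ we get $\Delta \mathrm{nM}(1) = 2$. Now consider $k \ge 2$. We have

$$\begin{aligned}
\mathrm{nM}(k+1) &= \max\Bigl(\min(4, \mathrm{nM}(k) + 2), \ \max_{i \in \{3, \dots, k+1\}} \min(2i, \mathrm{nM}(k - i + 2) + i)\Bigr) \\
&= \max_{i \in \{3, \dots, k+1\}} \min(2i, \mathrm{nM}(k - i + 2) + i) \\
&= \max_{i \in \{2, \dots, k\}} \min(2(i+1), \mathrm{nM}(k - (i+1) + 2) + (i+1)) \\
&= \max_{i \in \{2, \dots, k\}} \min(2i + 2, \mathrm{nM}(k - i + 1) + i + 1) \\
&= 1 + \max_{i \in \{2, \dots, k\}} \min(2i + 1, \mathrm{nM}(k - i + 1) + i).
\end{aligned}$$

Thus on the one hand we have $\mathrm{nM}(k+1) \ge 1 + \max_{i \in \{2, \dots, k\}} \min(2i, \mathrm{nM}(k - i + 1) + i) = 1 + \mathrm{nM}(k)$,
and on the other hand $\mathrm{nM}(k+1) \le 1 + \max_{i \in \{2, \dots, k\}} \min(2i + 1, \mathrm{nM}(k - i + 1) + i + 1) = 2 + \mathrm{nM}(k)$.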


Thus increasing the deficiency k by one increases nM(k) at least by one: 


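Corollary 7.6 $\mathrm{nM} : \mathbb{N} \to \mathbb{N}$ is strictly increasing.


And changing $\mathrm{nM}(a + b)$ to $\mathrm{nM}(a) + b$ can not increase the value:


Corollary 7.7 We have $\mathrm{nM}(a + b) \ge \mathrm{nM}(a) + b$ for $a \in \mathbb{N}$ and $b \in \mathbb{N}_0$, and thus
$\mathrm{nM}(a - b) \le \mathrm{nM}(a) - b$ for $b < a$.


Proof: We have $\mathrm{nM}(a + b) - \mathrm{nM}(a) = \sum_{i=0}^{b-1} \Delta \mathrm{nM}(a + i) \ge b \cdot 1$, whence the first
inequality. Applying it yields $\mathrm{nM}(a - b) + b \le \mathrm{nM}(a - b + b) = \mathrm{nM}(a)$.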


Instead of considering the maximum over $k - 1$ cases $i \in \{2, \dots, k\}$ to compute
$\mathrm{nM}(k)$ (according to Definition 7.1), we can now simplify the recursion to only one
case $i_{\mathrm{nM}}(k) \in \{2, \dots, k\}$, and for that case also consideration of the minimum is
dispensable. $i_{\mathrm{nM}}(k)$ is the first index $i$ in Definition 7.1 where the minimum is
attained by the nM-term, that is, where $2i \ge \mathrm{nM}(k - i + 1) + i$:


Definition 7.8 For $k \in \mathbb{N}$, $k \ge 2$, let $i_{\mathrm{nM}}(k) \in \mathbb{N}$ be the smallest $i \in \{2, \dots, k\}$
with $i \ge \mathrm{nM}(k - i + 1)$ (note that $k \ge \mathrm{nM}(k - k + 1) = 2$, and thus $i_{\mathrm{nM}}(k)$ is
well-defined).


Example 7.9 We have


1. $i_{\mathrm{nM}}(2) = 2$.

2. $i_{\mathrm{nM}}(3) = 3$, since $\mathrm{nM}(3 - 2 + 1) = 4$, $\mathrm{nM}(3 - 3 + 1) = 2$.
3. $i_{\mathrm{nM}}(4) = 4$, since $\mathrm{nM}(4 - 3 + 1) = 4$, $\mathrm{nM}(4 - 4 + 1) = 2$.
4. $i_{\mathrm{nM}}(5) = 4$, since $\mathrm{nM}(5 - 3 + 1) = 5$, $\mathrm{nM}(5 - 4 + 1) = 4$.
5. $i_{\mathrm{nM}}(6) = 5$, since $\mathrm{nM}(6 - 4 + 1) = 5$, $\mathrm{nM}(6 - 5 + 1) = 4$.


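As promised, from $i_{\mathrm{nM}}(k)$ we can compute $\mathrm{nM}(k)$ by one recursive call of nM:

Lemma 7.10 For $k \in \mathbb{N}$, $k \ge 2$, we have:

1. $0 \le i_{\mathrm{nM}}(k) - \mathrm{nM}(k - i_{\mathrm{nM}}(k) + 1) \le 2$.

2. $\Delta i_{\mathrm{nM}}(k) \in \{0, 1\}$.

3. $\mathrm{nM}(k) = \mathrm{nM}(k - i_{\mathrm{nM}}(k) + 1) + i_{\mathrm{nM}}(k)$.


Proof: For Part 1 we consider the sequence $i \mapsto f_k(i) := i - \mathrm{nM}(k - i + 1)$;
this sequence starts with $f_k(2) = 2 - \mathrm{nM}(k - 1) \le 0$, and finishes with $f_k(k) =
k - \mathrm{nM}(1) \ge 0$, and $i_{\mathrm{nM}}(k)$ is the smallest $i$ with $f_k(i) \ge 0$. By Lemma 7.5
we have $\Delta f_k(i) = 1 + \Delta \mathrm{nM}(k - i) \in \{1 + 1, 1 + 2\} = \{2, 3\}$. So for
$i_{\mathrm{nM}}(k) - \mathrm{nM}(k - i_{\mathrm{nM}}(k) + 1) = f_k(i_{\mathrm{nM}}(k))$ by definition we have $f_k(i_{\mathrm{nM}}(k)) \ge 0$, while
$f_k(i_{\mathrm{nM}}(k)) \le 2$ due to $\Delta f_k \le 3$ (otherwise $i_{\mathrm{nM}}(k)$ wouldn't be minimal).

For Part 2 we consider the sequence $k \mapsto g_i(k) := i - \mathrm{nM}(k - i + 1)$. Again
by Lemma 7.5 we get $\Delta g_i(k) \in \{-1, -2\}$. It follows immediately $\Delta i_{\mathrm{nM}}(k) \ge 0$.
Now assume $\Delta i_{\mathrm{nM}}(k) \ge 1$; thus $-2 \le g_{i_{\mathrm{nM}}(k)}(k + 1) < 0$, whence, as shown before,
$g_{i_{\mathrm{nM}}(k)+1}(k + 1) \ge -2 + 2 = 0$, and thus $\Delta i_{\mathrm{nM}}(k) = 1$.

For Part 3 we consider the sequence $i \mapsto h_k(i) := \mathrm{nM}(k - i + 1) + i$; by Lemma
7.5 we have $\Delta h_k(i) \in \{-1 + 1, -2 + 1\} = \{0, -1\}$. Thus, and by definition of
$i_{\mathrm{nM}}(k)$, we get $\mathrm{nM}(k) = \max(2 \cdot (i_{\mathrm{nM}}(k) - 1), h_k(i_{\mathrm{nM}}(k))) = \max(2 i_{\mathrm{nM}}(k) - 2, h_k(i_{\mathrm{nM}}(k)))$.
Finally $h_k(i_{\mathrm{nM}}(k)) \ge 2 i_{\mathrm{nM}}(k) - 2 \Leftrightarrow \mathrm{nM}(k - i_{\mathrm{nM}}(k) + 1) + 2 \ge i_{\mathrm{nM}}(k)$,
which holds by Part 1.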
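
The three parts of Lemma 7.10 can be checked numerically for small k. The sketch below is illustrative only: `nM` transcribes Definition 7.1, `i_nM` transcribes Definition 7.8, and `nM_single_call` is a hypothetical name for the simplified recursion of Part 3.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def nM(k):
    """Definition 7.1, with memoisation."""
    if k == 1:
        return 2
    return max(min(2 * i, nM(k - i + 1) + i) for i in range(2, k + 1))


def i_nM(k):
    """Definition 7.8: smallest i in {2,...,k} with i >= nM(k - i + 1)."""
    for i in range(2, k + 1):
        if i >= nM(k - i + 1):
            return i
    raise ValueError("i_nM(k) is only defined for k >= 2")


def nM_single_call(k):
    """Lemma 7.10, Part 3: one recursive nM-value plus i_nM(k)."""
    i = i_nM(k)
    return nM(k - i + 1) + i
```

For k = 2, ..., 12 this yields the values 2, 3, 4, 4, 5, 5, 6, 6, 7, 8, 8 for $i_{\mathrm{nM}}(k)$.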


We obtain an alternative, functional characterisation of inm(k): 


Corollary 7.11 For $k \in \mathbb{N}$, $k \ge 2$ the value $i_{\mathrm{nM}}(k) \in \{1, \dots, k\}$ is uniquely
characterised by the two inequalities

$$i_{\mathrm{nM}}(k) \ge \mathrm{nM}(k - i_{\mathrm{nM}}(k) + 1)$$
$$i_{\mathrm{nM}}(k) \le \mathrm{nM}(k - i_{\mathrm{nM}}(k) + 2)$$


Proof: As shown in the first part of the proof of Lemma 7.10, the sequence $i \mapsto
f_k(i) = i - \mathrm{nM}(k - i + 1)$ is strictly increasing.


7.2 Characterising the jumps 


After these preparations we are able to characterise the “jump positions”, which 
are defined as those k where the function nM increases by 2: 


Definition 7.12 Let $J := \{k \in \mathbb{N} : \Delta \mathrm{nM}(k) = 2\}$ be the set of jump positions.


Thus $\Delta \mathrm{nM}(k) = 1$ iff $k \notin J$, and by Table 1 we see $J = \{1, 4, 11, 26, 57, \dots\}$. Note
that $\mathrm{nM}(k) = 1 + k + |\{k' \in J : k' < k\}|$. It is useful to define two auxiliary
functions:


Definition 7.13 Let $i'(k) := k - i_{\mathrm{nM}}(k) + 1 \in \mathbb{N}$ for $k \in \mathbb{N}$, $k \ge 2$. And let
$h(k) := \mathrm{nM}(i'(k)) \in \mathbb{N}$ for $k \in \mathbb{N}$, $k \ge 2$.


Some basic properties:
1. We have $\Delta i'(k) = 1 - \Delta i_{\mathrm{nM}}(k)$.
2. Thus by Lemma 7.10, Part 2, holds $\Delta i'(k) \in \{0, 1\}$.
3. By Lemma 7.10, Part 3, we have $\mathrm{nM}(k) = h(k) + i_{\mathrm{nM}}(k)$.
4. Thus $\Delta h(k) = \Delta \mathrm{nM}(k) - \Delta i_{\mathrm{nM}}(k)$.
5. By Lemma 7.5 and Lemma 7.10, Part 2, we get $\Delta h(k) \in \{0, 1, 2\}$.
6. By Lemma 7.10, Part 1, we have $i_{\mathrm{nM}}(k) - h(k) \in \{0, 1, 2\}$.
7. By Corollary 7.11 we have $h(k) = \mathrm{nM}(i'(k)) \le i_{\mathrm{nM}}(k) \le \mathrm{nM}(i'(k) + 1)$.


It is instructive to consider initial values of the auxiliary functions in Table 2;
this table is constructed as follows:


• The values for $\mathrm{nM}(k)$ are from Table 1 (it would be possible to completely
construct the whole table row by row, but we leave this as an exercise to the
reader, once the section is completed).


• The columns $i'$, $h$ duplicate the columns $k$, nM, but with repetitions.
• Columns $i'$ and $i_{\mathrm{nM}}$ are connected via $i_{\mathrm{nM}} + i' = k + 1$.
• Columns nM, $i_{\mathrm{nM}}$ and $h$ are connected via $\mathrm{nM} = i_{\mathrm{nM}} + h$.


• The values of column $i_{\mathrm{nM}}$ are determined according to Corollary 7.11 by the
condition $h \le i_{\mathrm{nM}} \le h'$, where $h'(k)$ is the nM-value following $h(k)$.


  k   nM  ΔnM  i_nM  Δi_nM  i'  Δi'   h  Δh  i_nM − h
  1*   2   2    –     –     –    –    –   –     –
  2    4   1    2     1     1    0    2   0     0
  3    5   1    3     1     1    0    2   0     1
  4*   6   2    4     0     1    1    2   2     2
  5    8   1    4     1     2    0    4   0     0
  6    9   1    5     0     2    1    4   1     1
  7   10   1    5     1     3    0    5   0     0
  8   11   1    6     0     3    1    5   1     1
  9   12   1    6     1     4    0    6   0     0
 10   13   1    7     1     4    0    6   0     1
 11*  14   2    8     0     4    1    6   2     2
 12   16   1    8     1     5    0    8   0     0


Table 2: Values of auxiliary functions; the jump positions are marked with *

First we show some further simple properties of the auxiliary functions:


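Lemma 7.14 Consider $k \ge 2$.
1. If $\Delta i_{\mathrm{nM}}(k) = 0$, then:
(a) $\Delta i_{\mathrm{nM}}(k + 1) = 1$.
(b) $i_{\mathrm{nM}}(k) - h(k) \in \{1, 2\}$.
(c) $i_{\mathrm{nM}}(k + 1) = h(k + 1)$.


2. $\Delta i_{\mathrm{nM}}(k) = 1 \Leftrightarrow \Delta i'(k) = 0 \Leftrightarrow \Delta h(k) = 0$.


3. If $\Delta i_{\mathrm{nM}}(k) = 1$, then:
(a) $k \notin J$.
(b) $i_{\mathrm{nM}}(k) - h(k) \in \{0, 1\}$.


Proof: For Part 1(a) assume $\Delta i_{\mathrm{nM}}(k + 1) = 0$ (and thus $\Delta i'(k + 1) = 1$ due to
$\Delta i' = 1 - \Delta i_{\mathrm{nM}}$). Because of $\Delta h = \Delta \mathrm{nM} - \Delta i_{\mathrm{nM}}$ we obtain $\Delta h(k + 1) \ge 1$. Thus
$i_{\mathrm{nM}}(k) = i_{\mathrm{nM}}(k + 2) \ge h(k + 2) \ge h(k + 1) + 1 = \mathrm{nM}(i'(k + 1)) + 1 = \mathrm{nM}(i'(k) + 1) + 1$,
contradicting $i_{\mathrm{nM}}(k) \le \mathrm{nM}(i'(k) + 1)$. For the remainder of Part 1 note $\Delta h(k) =
\Delta \mathrm{nM}(k) \ge 1$.

For Part 1(b) note $i_{\mathrm{nM}}(k) = i_{\mathrm{nM}}(k + 1) \ge h(k + 1) \ge h(k) + 1$.

For Part 1(c) assume $i_{\mathrm{nM}}(k + 1) > h(k + 1)$. Thus $i_{\mathrm{nM}}(k) = i_{\mathrm{nM}}(k + 1) \ge
h(k + 1) + 1 \ge h(k) + 2$, whence $i_{\mathrm{nM}}(k) = h(k) + 2$. If we would have $\Delta h(k) = 2$,
then $i_{\mathrm{nM}}(k) = i_{\mathrm{nM}}(k + 1) > h(k + 1) = h(k) + 2$; thus $h(k + 1) = h(k) + 1$. So
$i_{\mathrm{nM}}(k) = h(k) + 2 = h(k + 1) + 1 = \mathrm{nM}(i'(k + 1)) + 1 = \mathrm{nM}(i'(k) + 1) + 1$, a
contradiction.

Part 2 is obvious, and Part 3(a) follows. Finally, Part 3(b) follows by $i_{\mathrm{nM}}(k + 1) \le
h(k + 1) + 2$ and $i_{\mathrm{nM}}(k + 1) = i_{\mathrm{nM}}(k) + 1$, while $h(k + 1) = h(k)$ due to Part 2,
whence $i_{\mathrm{nM}}(k) \le h(k) + 1$.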


We obtain the main characterisation of the jump positions via the auxiliary 
functions: 


Theorem 7.15 For $k \ge 2$ the following conditions are equivalent:
1. $k \in J$.
2. $\Delta h(k) = 2$.
3. $i_{\mathrm{nM}}(k) = h(k) + 2$.
4. $\Delta i_{\mathrm{nM}}(k - 1) = 1$ and $i_{\mathrm{nM}}(k - 1) = h(k - 1) + 1$.
5. $\Delta i_{\mathrm{nM}}(k - 2) = \Delta i_{\mathrm{nM}}(k - 1) = 1$ (yielding various equivalent forms via Lemma
7.14, Part 2).


Proof: Condition 1 implies Condition 2 due to $\Delta i_{\mathrm{nM}}(k) = 0$ in case of $k \in J$ by
Lemma 7.14, Part 3(a). Condition 2 implies Condition 3, since $\Delta h(k) = 2$ implies
$\Delta i_{\mathrm{nM}}(k) = 0$, and so by Lemma 7.14, Part 1(c) we have $i_{\mathrm{nM}}(k) = i_{\mathrm{nM}}(k + 1) = h(k + 1)$,
while the assumption says $h(k + 1) = h(k) + 2$. In turn Condition 3 implies Condition
1, since by Lemma 7.14, Part 3(b) we get $\Delta i_{\mathrm{nM}}(k) = 0$, and thus $\Delta \mathrm{nM}(k) = \Delta h(k)$,
where in case of $\Delta h(k) \le 1$ we would have $h(k) + 2 = i_{\mathrm{nM}}(k) \le \mathrm{nM}(i'(k) + 1) =
\mathrm{nM}(i'(k + 1)) = h(k + 1) \le h(k) + 1$. So now we can freely use the equivalence of
these three conditions.

Condition 3 implies Condition 4, since we have $\Delta i_{\mathrm{nM}}(k) = 0$, and thus $\Delta i_{\mathrm{nM}}(k - 1) = 1$
with Lemma 7.14, Part 1(a), from which we furthermore get $i_{\mathrm{nM}}(k) = i_{\mathrm{nM}}(k - 1) + 1$
and $h(k - 1) = h(k)$, and so $i_{\mathrm{nM}}(k - 1) = i_{\mathrm{nM}}(k) - 1 = h(k) + 1 = h(k - 1) + 1$.
Condition 4 implies Condition 5, since in case of $\Delta i_{\mathrm{nM}}(k - 2) = 0$ we had $i_{\mathrm{nM}}(k - 1) =
h(k - 1)$ with Lemma 7.14, Part 1(c). In turn Condition 5 implies Condition 3, since
$i_{\mathrm{nM}}(k) = i_{\mathrm{nM}}(k - 1) + 1 = i_{\mathrm{nM}}(k - 2) + 2$, while $h(k) = h(k - 1) = h(k - 2)$, where
by definition $i_{\mathrm{nM}}(k - 2) \ge h(k - 2)$ holds, whence $i_{\mathrm{nM}}(k) \ge h(k) + 2$, which implies
$i_{\mathrm{nM}}(k) = h(k) + 2$.


We understand now the shape of the four $\Delta$-sequences:


Corollary 7.16 By definition the sequence $(\Delta \mathrm{nM}(k))_{k \in \mathbb{N}}$ is 1 except at the jump
positions $k$, where it is 2. The other three $\Delta$-sequences are shaped as follows:


1. The sequence $(\Delta i_{\mathrm{nM}}(k))_{k \in \mathbb{N}, k \ge 2}$ consists of alternating 0,1's except the two
positions $k - 2, k - 1$ before a jump position $k \in J$, where we have two
consecutive 1's (while at the jump position we have 0).


2. The sequence $(\Delta i'(k))_{k \in \mathbb{N}, k \ge 2}$ consists of alternating 0,1's except two positions
before a jump position $k$, where we have two consecutive 0's.


3. The sequence $(\Delta h(k))_{k \in \mathbb{N}, k \ge 2}$ consists of alternating 0,1's except two positions
before a jump position $k$, where we have two consecutive 0's, followed by a 2
at the jump position $k$, which is followed by 0.

Proof: Part 1: By Lemma 7.14, Part 1(a) we have $\Delta i_{\mathrm{nM}}(k) = 0 \Rightarrow \Delta i_{\mathrm{nM}}(k + 1) = 1$,
while by Theorem 7.15, Part 5 we have $\Delta i_{\mathrm{nM}}(k) = \Delta i_{\mathrm{nM}}(k + 1) = 1 \Rightarrow k + 2 \in J$,
and by Lemma 7.14, Part 3(a) we have $k \in J \Rightarrow \Delta i_{\mathrm{nM}}(k) = 0$.

Part 2 follows from Part 1 by $\Delta i' = 1 - \Delta i_{\mathrm{nM}}$.

Part 3: By Lemma 7.14, Part 2 the 0's in the sequence $\Delta h$ are precisely the 1's
in the sequence $\Delta i_{\mathrm{nM}}$, while a 0 of $\Delta i_{\mathrm{nM}}$ translates into a 2 precisely at the jump
positions by Theorem 7.15, Part 2. The assertion follows now by Part 1.


Especially instructive is understanding of the $i'$-sequence:


Corollary 7.17 The $i'$-sequence $(i'(k))_{k \in \mathbb{N}, k \ge 2}$ consists of doublets $m, m$ for
consecutive $m = 1, 2, \dots$, except for $k \in J \setminus \{1\}$, where we have at positions $k - 2, k - 1, k$
a triplet $m, m, m$. These triplet-values occur exactly when $m \in J$.


Proof: The doublet/triplet structure follows by Corollary 7.16, Part 2. Now
consider a triplet $i'(k - 2) = i'(k - 1) = i'(k) = m$ for $k \in J \setminus \{1\}$, $m \in \mathbb{N}$. By
definition we have $\Delta \mathrm{nM}(m) = \Delta h(k)$ (due to $h(k) = \mathrm{nM}(i'(k)) = \mathrm{nM}(m)$, $h(k + 1) =
\mathrm{nM}(i'(k + 1)) = \mathrm{nM}(i'(k) + 1) = \mathrm{nM}(m + 1)$). By Theorem 7.15, Part 2 we
thus have $\Delta \mathrm{nM}(m) = 2$, i.e., $m \in J$. The triplets do not leave out some jump-value
in $J$, since for $m \in J$ and for the last position $k$ with $i'(k) = m$ we have
$\Delta \mathrm{nM}(m) = \Delta h(k)$.


Example 7.18 We see now how we can build up the three columns $k$, nM, $i'$ of Table 2
together with an enumeration of the set $J$, which is built up as the set $I$ ($I$ is an
initial part of $J$, which in the limit becomes $J$):


1. We start with the first row $k := 1$, initialising the value $n$ of $\mathrm{nM}(1) = n$
to $n := 2$, while $i'$ is undefined; $k = 1$ is the first jump position, that is,
$I := \{1\}$.


2. We go to the second row, $k := k + 1$. We update $n := n + 2$ and initialise the
running value of $i'(k) = m$ to $m := 1$.


3. We repeat the following steps ad infinitum:


(a) If $m \in I$, then three rows are created:
i. $\mathrm{nM}(k) = n$, $i'(k) = m$, $k := k + 1$, $n := n + 1$.
ii. $\mathrm{nM}(k) = n$, $i'(k) = m$, $k := k + 1$, $n := n + 1$.
iii. $\mathrm{nM}(k) = n$, $i'(k) = m$, $I := I \cup \{k\}$, $k := k + 1$, $n := n + 2$,
$m := m + 1$.
(b) If $m \notin I$, then two rows are created:
i. $\mathrm{nM}(k) = n$, $i'(k) = m$, $k := k + 1$, $n := n + 1$.
ii. $\mathrm{nM}(k) = n$, $i'(k) = m$, $k := k + 1$, $n := n + 1$, $m := m + 1$.
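
The row-building procedure of Example 7.18 translates almost literally into code. The following sketch is an illustration under our own naming (`build_rows`, `jumps` for the set $I$), truncated to a finite number of rows instead of running ad infinitum.

```python
def build_rows(limit):
    """Return the rows (k, nM(k), i'(k)) for k = 1..limit, built as in
    Example 7.18, together with the jump positions found up to limit."""
    rows = [(1, 2, None)]          # step 1: k = 1, nM(1) = 2, i' undefined
    jumps = {1}                    # the set I, an initial part of J
    k, n, m = 2, 4, 1              # step 2: second row, n := n + 2, m := 1
    while k <= limit:              # step 3, cut off at the requested limit
        is_triplet = m in jumps    # three rows if m is a jump position
        row_count = 3 if is_triplet else 2
        for j in range(row_count):
            rows.append((k, n, m))
            if is_triplet and j == row_count - 1:
                jumps.add(k)       # last row of a triplet is a jump position
                n += 2
            else:
                n += 1
            k += 1
        m += 1
    return rows[:limit], sorted(j for j in jumps if j <= limit)
```

Calling `build_rows(12)` reproduces the columns $k$, nM, $i'$ of Table 2 and the jump positions 1, 4, 11.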


Next we show that i’(k) for jump positions is the previous jump position: 


Lemma 7.19 For $k \in J$, $k \ge 2$, holds $i'(k) = \max\{k' \in J : k' < k\}$.


Proof: We prove the assertion by induction on $k$ (regarding the enumeration of $J$).
We have $i'(4) = 1$, and so the induction holds for $k = 4$, the smallest jump position
$k \ge 2$. Now assume that the assertion holds for all elements of $J \cap \{1, \dots, k - 1\}$,
where $k > 4$, and we have to show the assertion for $k$. By Corollary 7.17 we know
$i'(k) \in J$, where $2 \le i'(k) < k$. Assume there is $m \in J$ with $i'(k) < m < k$. By
induction hypothesis we get $i'(k) \le i'(m) < m$. However by Lemma 7.14 we get
$\Delta i'(m) = 1$, and thus $i'(k) > i'(m)$ (since $k > m$).


We obtain the promised characterisation of the jump positions: 


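Lemma 7.20 We have $J = \{2^{m+1} - (m + 1) - 1 : m \in \mathbb{N}\}$.


Proof: Let $k_m$ for $m \in \mathbb{N}$ be the $m$-th element of $J$; so the assertion is $k_m =
2^{m+1} - m - 2$. We have $k_1 = 4 - 1 - 2 = 1 = \min J$; in the remainder assume
$m \ge 2$. We prove the assertion by induction, in parallel with $i_{\mathrm{nM}}(k_m) = 2^{m+1} - 2^m$.
For $m = 2$ we have $k_2 = 8 - 2 - 2 = 4 = \min J \setminus \{1\}$, while $i_{\mathrm{nM}}(4)$ is the smallest
$i \in \{2, 3, 4\}$ with $i \ge \mathrm{nM}(5 - i)$, which yields $i_{\mathrm{nM}}(4) = 4 = 2^3 - 2^2$. Now we
consider the induction step, from $m - 1$ to $m$. The induction hypothesis yields
$k_{m-1} = 2^m - m - 1$ and $i_{\mathrm{nM}}(k_{m-1}) = 2^m - 2^{m-1}$. Lemma 7.19 yields $i'(k_m) = k_{m-1}$,
from which by $i'(k_m) = k_m - i_{\mathrm{nM}}(k_m) + 1$ follows

$$k_m = 2^m - m - 2 + i_{\mathrm{nM}}(k_m).$$

Via a telescoping series we get

$$i_{\mathrm{nM}}(k_m) = \Delta i_{\mathrm{nM}}(k_m - 1) + \dots + \Delta i_{\mathrm{nM}}(k_{m-1}) + i_{\mathrm{nM}}(k_{m-1}).$$

By Corollary 7.16, Part 1 the sequence $\Delta i_{\mathrm{nM}}(k_{m-1}), \dots, \Delta i_{\mathrm{nM}}(k_m - 1)$ has the
form $0, 1, 0, 1, \dots, 0, 1, 1$, and thus their sum has the value $\frac{1}{2}(k_m - k_{m-1} - 1) + 1$.
So we get

$$\begin{aligned}
i_{\mathrm{nM}}(k_m) &= \tfrac{1}{2}(k_m - k_{m-1} - 1) + 1 + i_{\mathrm{nM}}(k_{m-1}) \\
&= \tfrac{1}{2}(2^m - m - 2 + i_{\mathrm{nM}}(k_m) - 2^m + m + 1 - 1) + 1 + 2^m - 2^{m-1} \\
&= \tfrac{1}{2} i_{\mathrm{nM}}(k_m) + 2^m - 2^{m-1},
\end{aligned}$$

from which $i_{\mathrm{nM}}(k_m) = 2^{m+1} - 2^m$ follows. Finally $k_m = 2^m - m - 2 + 2^{m+1} - 2^m =
2^{m+1} - m - 2$.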


7.3 Applications 


Now the closed formula for nM(k) can be proven: 


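Theorem 7.21 For $k \in \mathbb{N}$ let $\mathrm{fld}(k) := \lfloor \mathrm{ld}(k) \rfloor$. Then we have for $k \in \mathbb{N}$ the
equality $\mathrm{nM}(k) = k + \mathrm{fld}(k + 1 + \mathrm{fld}(k + 1))$.


Proof: Let $g(k) := \mathrm{fld}(k + 1 + \mathrm{fld}(k + 1))$ and $f(k) := k + g(k)$ (so $\mathrm{nM}(k) = f(k)$ is
to be shown, for $k \ge 1$). We have $f(1) = 1 + \mathrm{fld}(2 + \mathrm{fld}(2)) = 1 + \mathrm{fld}(3) = 2 = \mathrm{nM}(1)$.
We will now prove that the function $g(k)$ changes values exactly at the transitions
$k \mapsto k + 1$ for $k \in J$, that is, for indices $k = k_m := 2^{m+1} - m - 2$ (using Lemma
7.20) with $m \in \mathbb{N}$ we have $\Delta g(k_m) = 1$, while otherwise we have $\Delta g(k) = 0$, from
which the assertion follows (by the definition of $J$).

We have $g(1) = 1$ and $g(2) = 2$. Now consider $m \in \mathbb{N}$ and $k_m + 1 \le k \le k_{m+1}$.
We show $g(k) = m + 1$, which proves the claim. Note that $g(k)$ is monotonically
increasing. Now $g(k) \ge g(k_m + 1) = \lfloor \mathrm{ld}(2^{m+1} - m + \lfloor \mathrm{ld}(2^{m+1} - m) \rfloor) \rfloor = \lfloor \mathrm{ld}(2^{m+1} -
m + m) \rfloor = m + 1$ and $g(k) \le g(k_{m+1}) = \lfloor \mathrm{ld}(2^{m+2} - m - 2 + \lfloor \mathrm{ld}(2^{m+2} - m - 2) \rfloor) \rfloor \le
\lfloor \mathrm{ld}(2^{m+2} - m - 2 + m + 1) \rfloor = \lfloor \mathrm{ld}(2^{m+2} - 1) \rfloor = m + 1$.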
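
As a sanity check, the closed formula of Theorem 7.21 can be compared with the recursion of Definition 7.1 on an initial segment. In this sketch `fld` is computed via Python's `int.bit_length`; the names `nM_rec`, `nM_closed` and `jump_positions` are hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def nM_rec(k):
    """Recursion of Definition 7.1."""
    if k == 1:
        return 2
    return max(min(2 * i, nM_rec(k - i + 1) + i) for i in range(2, k + 1))


def fld(n):
    """Floor of the binary logarithm of a positive integer n."""
    return n.bit_length() - 1


def nM_closed(k):
    """Closed formula of Theorem 7.21."""
    return k + fld(k + 1 + fld(k + 1))


def jump_positions(limit):
    """All k <= limit at which the closed formula increases by 2."""
    return [k for k in range(1, limit + 1)
            if nM_closed(k + 1) - nM_closed(k) == 2]
```

The jump positions obtained this way agree with the closed form $2^{m+1} - m - 2$ of Lemma 7.20.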


As a result, we obtain very precise bounds for $\mathrm{nM}(k)$:

Corollary 7.22 $k + \mathrm{fld}(k + 1) \le \mathrm{nM}(k) \le k + 1 + \mathrm{fld}(k)$ holds for $k \in \mathbb{N}$.


Proof: The lower bound follows trivially. The upper bound holds (with equality)
for $k \le 2$, so assume $k \ge 3$. We have to show $g(k) = \mathrm{fld}(k + 1 + \mathrm{fld}(k + 1)) \le 1 + \mathrm{fld}(k)$,
which follows from $\mathrm{ld}(k + 1 + \mathrm{fld}(k + 1)) \le 1 + \mathrm{ld}(k)$. Now $\mathrm{ld}(k + 1 + \mathrm{fld}(k + 1)) \le
\mathrm{ld}(k + 1 + \mathrm{ld}(k + 1)) \le \mathrm{ld}(k + k) = 1 + \mathrm{ld}(k)$.


Note that $(k + 1 + \mathrm{fld}(k)) - (k + \mathrm{fld}(k + 1)) \in \{0, 1\}$, where this difference is zero
iff $k + 1$ is a power of 2. Finally we can prove the already mentioned characterisation,
which motivates the terminology of “non-Mersenne numbers”, namely that
$(\mathrm{nM}(k))_{k \in \mathbb{N}}$ enumerates $\mathbb{N} \setminus \{2^n - 1 : n \in \mathbb{N}\}$.^{14)} For that we consider the positions
directly after the jump positions, which by Lemma 7.20 are the positions $2^n - n$ for
$n \ge 2$. From that position on until the next jump position, which is $2^{n+1} - n - 2$,
the nM-values increase constantly by 1 per step. So we just need to understand the
values of $\mathrm{nM}(2^n - n)$, to understand all of nM, which is achieved as follows (note
that $(2^{n+1} - n - 2) - (2^n - n) = 2^n - 2$):


Corollary 7.23 Consider $n \in \mathbb{N}$, $k := 2^n - n$, and $m \in \mathbb{N}_0$ with $m \le 2^n - 1$.
1. $\mathrm{nM}(k) = 2^n$.
2. More generally for $m < 2^n - 1$ holds $\mathrm{nM}(k + m) = 2^n + m$.
3. For $m = 2^n - 1$ we have $k + m = 2^{n+1} - (n + 1)$, and thus $\mathrm{nM}(k + m) = 2^{n+1}$.


Proof: By Theorem 7.21, Part 1 follows with $\mathrm{nM}(2^n - n) = 2^n - n + \mathrm{fld}(2^n - n +
1 + \mathrm{fld}(2^n - n + 1)) = 2^n - n + \mathrm{fld}(2^n - n + 1 + (n - 1)) = 2^n - n + \mathrm{fld}(2^n) = 2^n$.
Part 2 follows by Lemma 7.20, and Part 3 follows by Part 1.
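
The enumeration property and Corollary 7.23 can be illustrated numerically. The sketch below takes the closed formula of Theorem 7.21 as given; the helper name `non_mersenne_up_to` is hypothetical.

```python
def fld(n):
    """Floor of the binary logarithm of a positive integer n."""
    return n.bit_length() - 1


def nM(k):
    """Closed formula of Theorem 7.21."""
    return k + fld(k + 1 + fld(k + 1))


def non_mersenne_up_to(bound):
    """Natural numbers 1..bound that are not of the form 2**n - 1."""
    # x + 1 is a power of two exactly if (x + 1) & x == 0
    return [x for x in range(1, bound + 1) if (x + 1) & x != 0]
```

On the range of Table 1, the values nM(1), ..., nM(58) are exactly the numbers from 1 to 64 that are not of the form $2^n - 1$.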


Besides $\mathrm{nM}(2^n - n) = 2^n$ also the following special value is of importance:


Corollary 7.24 For $n \in \mathbb{N}$, $n \ge 2$, we have $\mathrm{nM}(2^n - n - 1) = 2^n - 2$.


It is also useful to have simple formulas for the $i_{\mathrm{nM}}(k)$-values around the jump
positions:


Corollary 7.25 For $n \in \mathbb{N}$, $n \ge 3$ the values of $i_{\mathrm{nM}}(2^n - n + m)$ are as follows,
using $p := 2^{n-1}$ (where for $m = -4$ we need $n \ge 4$):

m     | −4    −3    −2    −1    0    1     2     3     4
i_nM  | p−2   p−2   p−1   p     p    p+1   p+1   p+2   p+2


Proof: We have $i_{\mathrm{nM}}(2^n - n) = 2^{n-1}$ by Corollary 7.11:

$$\mathrm{nM}(2^n - n - 2^{n-1} + 1) = \mathrm{nM}(2^{n-1} - (n - 1)) = 2^{n-1}$$
$$\mathrm{nM}(2^n - n - 2^{n-1} + 2) = \mathrm{nM}(2^{n-1} - (n - 1) + 1) = 2^{n-1} + 1.$$

The remaining values follow by Corollary 7.16, Part 1.
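
The table of Corollary 7.25 can be spot-checked for small n. This sketch assumes the closed formula of Theorem 7.21 for nM and the search of Definition 7.8 for $i_{\mathrm{nM}}$; the function name `around_jump` is hypothetical.

```python
def fld(n):
    """Floor of the binary logarithm of a positive integer n."""
    return n.bit_length() - 1


def nM(k):
    """Closed formula of Theorem 7.21."""
    return k + fld(k + 1 + fld(k + 1))


def i_nM(k):
    """Definition 7.8: smallest i in {2,...,k} with i >= nM(k - i + 1)."""
    return next(i for i in range(2, k + 1) if i >= nM(k - i + 1))


def around_jump(n, offsets=range(-4, 5)):
    """Values i_nM(2**n - n + m) - 2**(n-1) for the offsets m."""
    p = 2 ** (n - 1)
    return [i_nM(2 ** n - n + m) - p for m in offsets]
```

For n = 4 and n = 5 the result is the offset pattern −2, −2, −1, 0, 0, 1, 1, 2, 2 relative to p, as stated in the table.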
We conclude with an alternative characterisation of the jump-set J: 


Corollary 7.26 For $k \in \mathbb{N}$ the following conditions are equivalent:
1. $\mathrm{nM}(k) < 2 \cdot i_{\mathrm{nM}}(k) - 1$.
2. $\mathrm{nM}(k) = 2 \cdot i_{\mathrm{nM}}(k) - 2$.
3. $k \in J$, that is, $k = 2^{m+1} - m - 2$ for some $m \in \mathbb{N}$.


14) Note that we are not speaking of “non-Mersenne primes”.


Proof: If $k \in J$, then by Theorem 7.15, Part 3 we have $\mathrm{nM}(k) = 2 \cdot i_{\mathrm{nM}}(k) - 2 <
2 \cdot i_{\mathrm{nM}}(k) - 1$. And if $k \notin J$, then by the same theorem we have $i_{\mathrm{nM}}(k) \le h(k) + 1$,
and thus $\mathrm{nM}(k) = h(k) + i_{\mathrm{nM}}(k) \ge 2 \cdot i_{\mathrm{nM}}(k) - 1$.


8 The min-var-degree upper bound for MU 


In a sense the main auxiliary lemma of this report is the following statement on 
the deficiencies obtained when splitting a saturated minimally unsatisfiable clause- 
set, which receives its importance from the fact that every minimally unsatisfiable 
clause-set can be saturated (recall Subsection B.4. this method was first applied in 
this context in 19}). 


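Lemma 8.1 Consider $F \in \mathcal{SMU}_{\delta=k}$ for $k \in \mathbb{N}$ and a variable $v \in \mathrm{var}(F)$ realising
the minimum var-degree (i.e., $\mathrm{vd}_F(v) = \mu\mathrm{vd}(F)$). Using $m_0 := \mathrm{ld}_F(\bar v)$ and $m_1 := \mathrm{ld}_F(v)$
we have $\langle v \to \varepsilon \rangle * F \in \mathcal{MU}_{\delta = k - m_\varepsilon + 1}$ for $\varepsilon \in \{0,1\}$, where $n(\langle v \to \varepsilon \rangle * F) = n(F) - 1$.
Since minimally unsatisfiable clause-sets have deficiency at least one, we
get $m_\varepsilon \le k$.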


Proof: We have $n(\langle v \to \varepsilon \rangle * F) = n(F) - 1$ since $F$ contains no pure variable,
while $v$ realises the minimum of var-degrees. Thus $\delta(\langle v \to \varepsilon \rangle * F) = \delta(F) - m_\varepsilon + 1$,
while (v > ¢) * F € MU by Lemma Part (i. 


Some explanations on this fundamental lemma: 


Example 8.2 If in the situation of Lemma 8.1 the value of $m_\varepsilon$ is minimal, i.e.,
$m_\varepsilon = 1$, then we have $\delta(\langle v \to \varepsilon \rangle * F) = \delta(F) = k$, while if $m_\varepsilon$ is maximal, i.e.,
$m_\varepsilon = k$, then we have $\delta(\langle v \to \varepsilon \rangle * F) = 1$. The deficiency is strictly decreased for
both splitting results iff $v$ is non-singular. The point of $v$ realising the minimum
var-degree is that we have control over the number of eliminated variables (namely
no further variable is eliminated). If $v \in \mathrm{var}(F)$ is arbitrary, then $\delta(\langle v \to \varepsilon \rangle * F) =
k - m_\varepsilon + 1 + r$, where $r$ is the number of variables in $F$ which occur only in the
clauses containing $\bar v$ for $\varepsilon = 0$ resp. in the clauses containing $v$ for $\varepsilon = 1$.

A class of concrete examples is given by the $\mathcal{F}_n \in \mathcal{SMU}_{\delta=2}$ ($n \ge 2$; recall
Example 3.4), where for every $v \in \mathrm{var}(\mathcal{F}_n)$ and $\varepsilon \in \{0,1\}$ we have $\langle v \to \varepsilon \rangle * \mathcal{F}_n \in
\mathcal{MU}_{\delta=1}$ (since every literal of $\mathcal{F}_n$ has degree 2).
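The deficiency formula of Lemma 8.1 can be observed on the full clause-sets $A_n$, which are saturated minimally unsatisfiable and variable-regular, so every variable realises the minimum var-degree. The sketch below is an illustration only; the encoding of literals as nonzero integers and all function names are assumptions made here.

```python
from itertools import product


def full_clause_set(n):
    """A_n: the 2^n full clauses over variables 1..n (literals are nonzero ints)."""
    return {frozenset(s * v for s, v in zip(signs, range(1, n + 1)))
            for signs in product((1, -1), repeat=n)}


def variables(F):
    return {abs(x) for C in F for x in C}


def deficiency(F):
    return len(F) - len(variables(F))


def literal_degree(F, x):
    return sum(1 for C in F if x in C)


def split(F, v, eps):
    """<v -> eps> * F: remove satisfied clauses, delete the falsified literal."""
    true_literal = v if eps == 1 else -v
    return {C - {-true_literal} for C in F if true_literal not in C}


def check_splitting(n, v=1):
    """Return (k, m_eps, deficiency after splitting) for eps = 0, 1 on A_n."""
    F = full_clause_set(n)
    k = deficiency(F)
    rows = []
    for eps in (0, 1):
        # m_0 counts occurrences of the negative literal, m_1 of the positive one.
        m_eps = literal_degree(F, v if eps == 1 else -v)
        rows.append((k, m_eps, deficiency(split(F, v, eps))))
    return rows
```

In each returned triple the third entry equals $k - m_\varepsilon + 1$; for $A_3$ this is $5 - 4 + 1 = 2$, the deficiency of $A_2$.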


The definition of $\mathrm{nM}(k)$ (recall its definition) matches the recursion-structure
of Lemma 8.1, and we obtain an upper bound on the min-var-degree for minimally
unsatisfiable clause-sets:


Theorem 8.3 For all $k \in \mathbb{N}$ and $F \in \mathcal{MU}_{\delta \le k}$ we have $\mu\mathrm{vd}(F) \le \mathrm{nM}(k)$. More
precisely, for $n(F) > 0$ there exists a variable $v \in \mathrm{var}(F)$ with $\mathrm{vd}_F(v) \le \mathrm{nM}(k)$ and
$\mathrm{ld}_F(v), \mathrm{ld}_F(\bar v) \le k$.


Proof: The assertion is known for $k = 1$, so assume $k > 1$, and we apply induction
on $k$. Assume $\delta(F) = k$ (due to $k > 1$ we have $n(F) > 1$). Saturate $F$ and obtain $F'$.
Consider a variable $v \in \mathrm{var}(F')$ realising the min-var-degree of $F'$. If $\mathrm{vd}_{F'}(v) = 2$
then we are done, so assume $\mathrm{vd}_{F'}(v) \ge 3$. Let $i := \max(\mathrm{ld}_{F'}(v), \mathrm{ld}_{F'}(\bar v))$; so
$\mathrm{vd}_{F'}(v) \le 2i$. W.l.o.g. assume that $i = \mathrm{ld}_{F'}(v)$. By Lemma 8.1 we get $2 \le i \le k$.
Applying the induction hypothesis and Lemma 8.1 we obtain a variable $w \in \mathrm{var}(G)$
for $G := \langle v \to 1 \rangle * F'$ with $\mathrm{vd}_G(w) \le \mathrm{nM}(k - i + 1)$. By definition we have $\mathrm{vd}_{F'}(w) \le
\mathrm{vd}_G(w) + \mathrm{ld}_{F'}(v)$. Altogether $\mu\mathrm{vd}(F) \le \min(2i, \mathrm{nM}(k - i + 1) + i) \le \mathrm{nM}(k)$.
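The last inequality of the proof reflects a recursion for nM. As a hedged illustration, the sketch below assumes that the recursion behind the definition is $\mathrm{nM}(1) = 2$ and $\mathrm{nM}(k) = \max_{2 \le i \le k} \min(2i, \mathrm{nM}(k-i+1) + i)$ (this form is inferred from the proof above, not quoted from the definition) and compares it with the closed form $k + \mathrm{fld}(k + 1 + \mathrm{fld}(k+1))$ of Theorem 7.21. The function names are chosen here.

```python
def fld(x):
    """Floor of the binary logarithm of x >= 1."""
    return x.bit_length() - 1


def nM_closed(k):
    """Closed form of Theorem 7.21."""
    return k + fld(k + 1 + fld(k + 1))


def nM_table(k_max):
    """table[k] = nM(k) via the assumed recursion; table[0] is unused."""
    table = [None, 2]
    for k in range(2, k_max + 1):
        # i plays the role of the larger literal degree of the split variable.
        table.append(max(min(2 * i, table[k - i + 1] + i)
                         for i in range(2, k + 1)))
    return table
```

Under this assumption, every choice of $i$ in the induction step is bounded by the maximum, which is exactly what the step $\min(2i, \mathrm{nM}(k-i+1)+i) \le \mathrm{nM}(k)$ uses.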
The upper bound on the minimum variable degree of Theorem 8.3 is not sharp,
and will be further investigated from Section 13 on. However the bound is attained
for infinitely many deficiencies, and we demonstrate now that the jump positions
(the set $J$; recall Definition 7.12) are such deficiencies. Moreover, to investigate the
remaining deficiencies, we show that they always have at least two variables realising
the bound (if the bound is attained at all); this will be used to prove Theorem 14.3.
So we consider “extremal” $F \in \mathcal{MU}_{\delta=k}$ with $\mu\mathrm{vd}(F) = \mathrm{nM}(k)$, and we show that
such extremal clause-sets have at least two different variables of minimal degree,
if $k \notin J$. First, it is useful to have a notation for the set of variables of minimal
degree:


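Definition 8.4 For $F \in \mathcal{CLS}$ let $\mathrm{var}_{\mu\mathrm{vd}}(F) \subseteq \mathrm{var}(F)$ be the set of variables of
minimal degree, that is, $\mathrm{var}_{\mu\mathrm{vd}}(F) := \{v \in \mathrm{var}(F) : \mathrm{vd}_F(v) = \mu\mathrm{vd}(F)\}$.


Obviously $\mathrm{var}_{\mu\mathrm{vd}}(F) \ne \emptyset$ iff $n(F) > 0$, and $\mathrm{var}_{\mu\mathrm{vd}}(F) = \mathrm{var}(F)$ holds iff $F$ is
variable-regular.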


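Lemma 8.5 Consider $k \in \mathbb{N}$.
1. For $k \notin J$ and $F \in \mathcal{MU}_{\delta=k}$ with $\mu\mathrm{vd}(F) = \mathrm{nM}(k)$ we have $|\mathrm{var}_{\mu\mathrm{vd}}(F)| \ge 2$.
2. For $k \in J$ there is $F \in \mathcal{UHIT}_{\delta=k}$ with $\mu\mathrm{vd}(F) = \mathrm{nM}(k)$ and $|\mathrm{var}_{\mu\mathrm{vd}}(F)| = 1$.


Proof: First assume $k \notin J$; we have to show the existence of different $v, w \in
\mathrm{var}_{\mu\mathrm{vd}}(F)$. W.l.o.g. $F$ is saturated. Consider $v \in \mathrm{var}_{\mu\mathrm{vd}}(F)$. By Corollary 7.26 we
have $\mathrm{nM}(k) \ge 2 \cdot i_{\mathrm{nM}}(k) - 1$. Because of $\mathrm{ld}_F(v) + \mathrm{ld}_F(\bar v) = \mathrm{nM}(k)$ thus w.l.o.g.
$e_1 := \mathrm{ld}_F(v) \ge i_{\mathrm{nM}}(k)$. Let $F' := \langle v \to 1 \rangle * F$. So $\delta(F') = k - e_1 + 1$. Recall
$\mathrm{nM}(k) = \mathrm{nM}(k - i_{\mathrm{nM}}(k) + 1) + i_{\mathrm{nM}}(k)$ (Lemma 7.10), and thus $\mathrm{nM}(k) \ge
\mathrm{nM}(k - e_1 + 1) + e_1$. Since $n(F) \ge 2$, we can consider $w \in \mathrm{var}_{\mu\mathrm{vd}}(F')$. We have
$\mathrm{vd}_{F'}(w) \le \mathrm{nM}(k - e_1 + 1)$ and $\mathrm{vd}_F(w) \le \mathrm{vd}_{F'}(w) + e_1$. Thus $\mathrm{vd}_F(w) \le \mathrm{nM}(k) = \mu\mathrm{vd}(F)$,
and so $w \in \mathrm{var}_{\mu\mathrm{vd}}(F)$.
Now assume $k \in J$, that is, $k = 2^{m+1} - m - 2$ for $m \ge 1$. For $k = 1$ we
have the example $\{\{1\}, \{-1\}\}$, so assume $k \ge 2$. Then we have $\mathrm{nM}(k) = 2^{m+1} - 2$.
Now we obtain an example from $A_{m+1}$ by performing one strict full subsumption
resolution: The resolution variable occurs $2^{m+1} - 2$ times, the other $m$ variables
occur $2^{m+1} - 1$ times.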


In Lemma 12.11 we formulate the sharpness of the upper bound of Theorem 8.3
for these cases.


9 The min-var-degree upper bound for LEAN 


In this section we prove Theorem 9.8, the upper bound $\mathrm{nM}(k)$ on the min-var-degree
for lean clause-sets of deficiency $k$, and the sharpness of this upper bound
for any class between $\mathcal{VMU}$ and $\mathcal{LEAN}$ in Theorem 9.12. The proof consists in
lifting Theorem 8.3 to the general case in Subsection 9.2, while in Subsection 9.1
we introduce the auxiliary class $\mathcal{SED}$ of clause-sets, where deficiency and surplus
coincide; Lemma 9.5 there shows that unsatisfiable elements of $\mathcal{SED}$ are
variable-minimally unsatisfiable. Sharpness of the upper bound is considered in Subsection 9.3.


9.1 Clause-sets with extremal surplus 


We consider the task of generalising Theorem 8.3 to $F \in \mathcal{LEAN}$. Consider an
arbitrary (multi-)clause-set $F$. Consider a set of variables $\emptyset \ne V \subseteq \mathrm{var}(F)$ realising
the surplus of $F$, i.e., such that $\delta(F[V])$ is minimal (recall Definition 3.19). If $F[V]$
were satisfiable, then a satisfying assignment would give a non-trivial autarky
for $F$. Assuming that $F$ is lean thus yields that $F[V]$ must be unsatisfiable. So there
exists a minimally unsatisfiable $F' \subseteq F[V]$. If now $\mathrm{var}(F') \ne \mathrm{var}(F[V]) = V$ were
the case, then we would lose control over the deficiency of $F'$. Fortunately
this cannot happen, as we will show in Lemma 9.5. To understand this result,
the following class of clause-sets with maximal surplus (relative to the deficiency)
is important.


Definition 9.1 Let the class $\mathcal{SED} \subset \mathcal{CLS}$ (“surplus equal deficiency”) consist of
those clause-sets $F \in \mathcal{CLS}$ with $\sigma(F) = \delta(F)$.


It seems the class SED crosses the classes considered in this report in interesting 
extremal cases. 


Example 9.2 Some basic examples:

1. We have $\top \in \mathcal{SED}$ and $\{\bot\} \notin \mathcal{SED}$, and more generally, for every $F \in \mathcal{SED}$
we have $\bot \notin F$.

2. For a clause $C \in \mathcal{CL}$ we have $\delta(\{C\}) = 1 - |C|$ and $\sigma(\{C\}) = 0$, and thus
$\{C\} \in \mathcal{SED} \Leftrightarrow |C| = 1$.

3. For $F := \{\{1\}, \{2\}\}$ we have $\sigma(F) = \delta(F) = 0$, and thus $F \in \mathcal{SED}$. However
for the multi-clause-set $F' := \{2 * \{1\}, \{2\}\}$ we have $\delta(F') = 1$, while still
$\sigma(F') = 0$, and thus $F' \notin \mathcal{SED}$.

4. Another example for $F \in \mathcal{SED}$ with $\delta(F) = 0$ is $F := \{\{1, 2\}, \{-1, 2\}\}$.

5. $A_n \in \mathcal{SED}$ for $n \ge 1$ (Example 2.29).

6. $\mathcal{MU}_{\delta=1} \setminus \{\{\bot\}\} \subset \mathcal{SED}$ (since for $F \in \mathcal{MU} \setminus \{\{\bot\}\}$ holds $\sigma(F) \ge 1$).

7. $\mathcal{F}_n \in \mathcal{SED}$ for $n \ge 2$ (Example 3.4).

8. For $F := \{\{1,2,3\}, \{1,2,-3\}, \{1,-2\}, \{-1,2\}, \{-1,-2\}\}$ we have $F \in \mathcal{MU}$
with $\delta(F) = 2$, but $\sigma(F) = 1$, and thus $F \notin \mathcal{SED}$.

9. In Definition 10.5 we introduce the subclass $\mathcal{MLCR} \subset \mathcal{SED} \cap \mathcal{SAT}$, and
Example 10.7 shows elements of this class.

10. See also Example 10.11 and Question 10.12.

Finally we note that $F \in \mathcal{SED}$ iff $F' \in \mathcal{SED}$, where $F'$ is the multi-clause-set
obtained from $F$ by forgetting all signs of the literals, i.e., replacing clauses $C \in F$
by $\mathrm{var}(C)$ (since $\delta(F') = \delta(F)$ and $\sigma(F') = \sigma(F)$).


The corresponding class $\mathcal{SED}$ of multi-clause-sets is not invariant under
multiplicities; consider a multi-clause-set $F$ and the underlying clause-set $F'$:


1. If $F' \in \mathcal{SED}$, then in general we do not have $F \in \mathcal{SED}$ (Example 9.2).


2. However in general holds $F \in \mathcal{SED} \Rightarrow F' \in \mathcal{SED}$, since if we would have
$F' \notin \mathcal{SED}$, then $\sigma(F') < \delta(F')$, and adding a duplicated clause to a
multi-clause-set increases $\delta$ by $+1$, while $\sigma$ is at most increased by $+1$ (it may also
stay unchanged).


A simple but instructive equivalent formulation of $\mathcal{SED}$ is as follows, which also
yields a stronger corollary than the above “$F \in \mathcal{SED} \Rightarrow F' \in \mathcal{SED}$”:
Lemma 9.3 For a multi-clause-set $F$ we have $F \in \mathcal{SED}$ if and only if for all
$\emptyset \subseteq V \subset \mathrm{var}(F)$ and for the sub-multi-clause-set $F_V \le F$ consisting of all $C \in F$
with $\mathrm{var}(C) \subseteq V$ (with the same multiplicities) we have $c(F_V) \le |V|$.


Proof: For arbitrary $F \in \mathcal{CLS}$ and every $\emptyset \subseteq V \subseteq \mathrm{var}(F)$ we have
$$c(F[V]) + c(F_{\mathrm{var}(F) \setminus V}) = c(F),$$
and thus we get, using $V' := \mathrm{var}(F) \setminus V$:

$$\delta(F[V]) \ge \delta(F) \Leftrightarrow c(F[V]) - |V| \ge c(F) - n(F) \Leftrightarrow c(F) - c(F_{V'}) - |V| \ge c(F) - n(F) \Leftrightarrow c(F_{V'}) \le |V'|.$$

These $V'$ run through all $\emptyset \subseteq V' \subset \mathrm{var}(F)$.
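The two characterisations of SED can be compared by brute force on small inputs. The sketch below is illustrative only and makes several assumptions of its own: multi-clause-sets are Python lists of frozensets of nonzero integers, the surplus is computed by enumerating all nonempty variable sets (exponential, so only for tiny examples), and only inputs with $n(F) > 0$ are handled. All names are chosen here.

```python
from itertools import combinations


def variables(F):
    """Sorted list of the variables of F; literals are nonzero integers."""
    return sorted({abs(x) for C in F for x in C})


def deficiency(F):
    """delta(F) = c(F) - n(F), counting clauses with multiplicity."""
    return len(F) - len(variables(F))


def surplus(F):
    """sigma(F): minimum of delta(F[V]) over nonempty V (brute force)."""
    vs = variables(F)
    if not vs:
        raise ValueError("this sketch only handles n(F) > 0")
    values = []
    for r in range(1, len(vs) + 1):
        for V in combinations(vs, r):
            # c(F[V]) counts the clauses containing some variable of V.
            touched = sum(1 for C in F if any(abs(x) in V for x in C))
            values.append(touched - r)
    return min(values)


def in_sed_by_definition(F):
    return surplus(F) == deficiency(F)


def in_sed_by_lemma(F):
    """Criterion of Lemma 9.3: c(F_V) <= |V| for all proper V, including the empty set."""
    vs = variables(F)
    for r in range(len(vs)):
        for V in combinations(vs, r):
            inside = sum(1 for C in F if all(abs(x) in V for x in C))
            if inside > r:
                return False
    return True


EXAMPLES = {
    "two units": [frozenset({1}), frozenset({2})],
    "duplicated unit": [frozenset({1}), frozenset({1}), frozenset({2})],
    "A_2": [frozenset(c) for c in ({1, 2}, {-1, 2}, {1, -2}, {-1, -2})],
    "MU, deficiency 2, surplus 1": [frozenset(c) for c in
                                    ({1, 2, 3}, {1, 2, -3}, {1, -2}, {-1, 2}, {-1, -2})],
}
```

On these inputs both tests agree; for the last example the set $V = \{1, 2\}$ already contains three clauses, which violates the criterion of the lemma.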


We remark that $c(F_\emptyset) \le 0 \Rightarrow \bot \notin F$. We obtain as an immediate corollary
that decreasing multiplicities in $F \in \mathcal{SED}$ does not leave this class (even if the
multiplicity drops to zero):


Corollary 9.4 For $F \in \mathcal{SED}$ and $F' \le F$ we have $F' \in \mathcal{SED}$.


Unsatisfiable elements of SED have a strong structure: 


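Lemma 9.5 $\mathcal{SED} \cap \mathcal{USAT} \subseteq \mathcal{VMU}$, for clause-sets and thus also for multi-clause-sets.


Proof: Consider $F \in \mathcal{SED} \cap \mathcal{USAT}$, and assume there is an unsatisfiable $F' \subseteq F$
with $\mathrm{var}(F') \subset \mathrm{var}(F)$; consider a minimally unsatisfiable sub-clause-set $F'' \subseteq F'$.
By definition we have for $V := \mathrm{var}(F) \setminus \mathrm{var}(F'') \ne \emptyset$:

$$\delta(F'') = c(F'') - n(F'') = c(F'') - (n(F) - n(F[V])) \le (c(F) - c(F[V])) - (n(F) - n(F[V])) = \delta(F) - \delta(F[V]) \le \delta(F) - \sigma(F) = 0,$$

contradicting $\delta(F'') \ge 1$ (since $F'' \in \mathcal{MU}$). Thus we have $\mathcal{SED} \cap \mathcal{USAT} \subseteq \mathcal{VMU}$.
An example of $F \in \mathcal{MU} \setminus \mathcal{SED}$ is given in Example 9.2. Finally consider a
multi-clause-set $F \in \mathcal{SED} \cap \mathcal{USAT}$: thus for the underlying clause-set $F'$ holds $F' \in \mathcal{SED} \cap \mathcal{USAT}$,
whence $F' \in \mathcal{VMU}$, and thus $F \in \mathcal{VMU}$.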


We conclude this subsection by considering the complexity of SAT decision for 
F € SED 5=» for parameter k € N. By Lemma we could use Theorem 4-7, how- 
ever we have SEDs—, C MLEAN, and thus we can apply the fpt-result discussed 
in Example and thus SAT decision for inputs in SED 5=,» is fpt in k. 


Question 9.6 Can SAT decision for SED be done in polynomial time? If so, can 
we also find a satisfying assignment quickly? 


9.2 The generalised upper bound


Back to the main task: the central lemma utilises Lemma 9.5 to show that from
extremal $F[V]$ we obtain variables of low degree for $F$ itself:


Lemma 9.7 Consider a multi-clause-set $F$ and $\emptyset \subset V \subseteq \mathrm{var}(F)$ such that $F[V]$ is
unsatisfiable and $\sigma(F[V]) = \delta(F[V]) \ge 1$ (whence $F[V] \in \mathcal{SED} \cap \mathcal{MLEAN}$). Then
there exists $v \in V$ with $\mathrm{vd}_F(v) \le \mathrm{nM}(\delta(F[V]))$ and $\mathrm{ld}_F(v), \mathrm{ld}_F(\bar v) \le \delta(F[V])$.




Proof: Let $F' := F[V]$ and consider some minimally unsatisfiable $F'' \subseteq F'$. By
Lemma 9.5 we have $\mathrm{var}(F'') = \mathrm{var}(F')$. So we get $\delta(F'') = \delta(F') - (c(F') - c(F''))$.
By Theorem 8.3 there is $v \in \mathrm{var}(F'')$ with

$$\mathrm{vd}_{F''}(v) \le \mathrm{nM}(\delta(F'')) = \mathrm{nM}(\delta(F') - (c(F') - c(F''))) \le \mathrm{nM}(\delta(F')) - (c(F') - c(F''))$$

and $\mathrm{ld}_{F''}(v), \mathrm{ld}_{F''}(\bar v) \le \delta(F'') = \delta(F') - (c(F') - c(F''))$. Finally we have $\mathrm{vd}_F(v) \le
\mathrm{vd}_{F''}(v) + (c(F') - c(F''))$ (note that all occurrences of $v$ in $F$ are also in $F'$), and
similarly for the literal degrees.


We are ready to show the generalisation and strengthening of Theorem 8.3:


Theorem 9.8 We have $\mu\mathrm{vd}(F) \le \mathrm{nM}(\sigma(F))$ for a lean multi-clause-set $F$ with
$n(F) > 0$. More precisely, there exists a variable $v \in \mathrm{var}(F)$ with $\mathrm{vd}_F(v) \le
\mathrm{nM}(\sigma(F))$ and $\mathrm{ld}_F(v), \mathrm{ld}_F(\bar v) \le \sigma(F)$.


Proof: Recall that $F$ is a lean multi-clause-set with $n(F) > 0$, and we have to show
the existence of a variable $v$ with $\mathrm{vd}_F(v) \le \mathrm{nM}(\sigma(F))$ and $\mathrm{ld}_F(v), \mathrm{ld}_F(\bar v) \le \sigma(F)$.
Consider $\emptyset \ne V \subseteq \mathrm{var}(F)$ with $\delta(F[V]) = \sigma(F)$, and let $F' := F[V]$. $F'$ is
unsatisfiable, since $F$ is lean. Because of $\delta(F') = \sigma(F)$ we have $\delta(F') = \sigma(F')$. So
we can apply Lemma 9.7.


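Since for a variable $v \in \mathrm{var}(F)$ for any $F \in \mathcal{CLS}$ holds
$$\delta(F[\{v\}]) = \mathrm{vd}_F(v) - 1 \ge \mu\mathrm{vd}(F) - 1 \ge \sigma(F),$$
and the surplus is a lower bound for the deficiency, we get:


Corollary 9.9 For a lean multi-clause-set $F$, $n(F) > 0$, we have

$$\sigma(F) + 1 \le \mu\mathrm{vd}(F) \le \mathrm{nM}(\sigma(F)) \le \sigma(F) + 1 + \mathrm{fld}(\sigma(F))$$
$$\mu\mathrm{vd}(F) \le \mathrm{nM}(\delta(F)) \le \delta(F) + 1 + \mathrm{fld}(\delta(F)).$$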


That the bounds from Corollary 9.9 are sharp in general is shown by the
following examples.


Example 9.10 First we consider any lean clause-set $F \ne \top$, and perform a
non-strict full subsumption extension $F \leadsto F'$. Obviously $F'$ is lean as well (with $\delta(F') =
\delta(F)$). Then we have $\mu\mathrm{vd}(F') = 2$ and $\sigma(F') = 1$, and thus

$$2 = \sigma(F') + 1 = \mu\mathrm{vd}(F') = \mathrm{nM}(\sigma(F')) = \sigma(F') + 1 + \mathrm{fld}(\sigma(F')),$$

while $\delta(F')$ is unbounded. This construction will be taken up again in Lemma 10.13.
Now we turn to the $\delta$-upper bounds. For $n \ge 2$ consider $A_n$. We have $\sigma(A_n) =
\delta(A_n) = 2^n - n$ by Example 2.24. Thus here the inequalities of Corollary 9.9 are

$$2^n - n + 1 = \sigma(A_n) + 1 < 2^n = \mu\mathrm{vd}(A_n) = \mathrm{nM}(\delta(A_n)) = \delta(A_n) + 1 + \mathrm{fld}(\delta(A_n))$$

(using Corollary 7.24).
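Both sharpness examples can be reproduced on small instances. The following sketch is an illustration under assumptions made here: clause-sets are lists of frozensets of nonzero integers, the surplus is computed by brute force over all nonempty variable sets, and `subsumption_extension` is a hypothetical helper implementing the replacement of a clause $C$ by $C \cup \{v\}$, $C \cup \{\bar v\}$ for a new variable $v$.

```python
from itertools import combinations, product


def fld(x):
    return x.bit_length() - 1


def nM(k):
    return k + fld(k + 1 + fld(k + 1))


def full_clause_set(n):
    """A_n: all 2^n full clauses over the variables 1..n."""
    return [frozenset(s * v for s, v in zip(signs, range(1, n + 1)))
            for signs in product((1, -1), repeat=n)]


def variables(F):
    return sorted({abs(x) for C in F for x in C})


def deficiency(F):
    return len(F) - len(variables(F))


def surplus(F):
    """Brute-force minimum of delta(F[V]) over nonempty V; needs n(F) > 0."""
    vs = variables(F)
    return min(sum(1 for C in F if any(abs(x) in V for x in C)) - r
               for r in range(1, len(vs) + 1)
               for V in combinations(vs, r))


def min_var_degree(F):
    return min(sum(1 for C in F if v in C or -v in C) for v in variables(F))


def subsumption_extension(F, C, v):
    """Replace clause C by C + {v} and C + {-v} for a new variable v."""
    assert C in F and v not in variables(F)
    return [D for D in F if D != C] + [C | {v}, C | {-v}]
```

For $A_2$ one gets surplus and deficiency 2 and minimum variable degree $4 = \mathrm{nM}(2)$; extending the clause $\{1, 2\}$ by a new variable 3 keeps the deficiency at 2 but drops the surplus to 1 and the minimum variable degree to 2.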
9.3 Sharpness of the bound for VMU


We now show that for every deficiency $k$ there are variable-minimally unsatisfiable
clause-sets where the min-var degree is $\mathrm{nM}(k)$ (strengthening Example 9.10). The
examples are obtained as follows:


Lemma 9.11 For a clause-set $F \in \mathcal{CLS}$, $\bot \notin F$ and $n(F) > 0$, with at least one
full clause, consider the following construction of $F' \in \mathcal{CLS}$:


1. Let $C$ be a full clause of $F$.

2. Let $F''$ be a full singular unit-extension of $F$ (recall Definition 5.14).

3. Let $F' := F'' \mathbin{\dot\cup} \{C\}$.

We have the following properties:

1. $\sigma(F') = \sigma(F) + 1$, $\delta(F') = \delta(F) + 1$, $\mu\mathrm{vd}(F') = \mu\mathrm{vd}(F) + 1$.

2. $F \in \mathcal{SED} \Rightarrow F' \in \mathcal{SED}$.

3. $F \in \mathcal{USAT} \Rightarrow F' \in \mathcal{USAT}$.
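Proof: With Lemma we get $\sigma(F'') = \sigma(F)$, $\delta(F'') = \delta(F)$, $\mu\mathrm{vd}(F'') =
\mu\mathrm{vd}(F)$. Obviously $\delta(F') = \delta(F'') + 1$. Let $\mathrm{var}(F'') \setminus \mathrm{var}(F) = \{v\}$. To see
$\mu\mathrm{vd}(F') = \mu\mathrm{vd}(F'') + 1$, we note that for $w \in \mathrm{var}(F)$ we have $\mathrm{vd}_{F'}(w) = \mathrm{vd}_{F''}(w) + 1$,
while $\mathrm{vd}_{F'}(v) = c(F') - 1 = c(F'') \ge \mathrm{vd}_{F''}(w) + 1$.

To prove $\sigma(F') = \sigma(F'') + 1$, we consider $\emptyset \subset V \subseteq \mathrm{var}(F') = \mathrm{var}(F'')$. If
$v \notin V$, then $\delta(F'[V]) = \delta(F''[V]) + 1$, since $C$ is full for $F$. If $V = \{v\}$, then
$\delta(F'[V]) = \delta(F''[V]) = c(F) \ge \sigma(F'') + 1$. Finally, if $V \supset \{v\}$, then $\delta(F'[V]) =
c(F') - |V| \ge \delta(F') = \delta(F'') + 1 \ge \sigma(F'') + 1$.

The implication $F \in \mathcal{SED} \Rightarrow F' \in \mathcal{SED}$ follows now by definition of $\mathcal{SED}$, and
$F \in \mathcal{USAT} \Rightarrow F' \in \mathcal{USAT}$ is trivial.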


With the construction of Lemma 9.11 we now show that the general upper bound
on the min-var degree of lean clause-sets is tight for variable-minimally unsatisfiable
clause-sets:


Theorem 9.12 For a class $\mathcal{VMU} \cap \mathcal{SED} \subseteq \mathcal{C} \subseteq \mathcal{LEAN}$ and $k \in \mathbb{N}$ we have
$\mu\mathrm{vd}(\mathcal{C}_{\delta=k}) = \mathrm{nM}(k)$.


Proof: By Theorem 9.8 it remains to show the lower bound $\mu\mathrm{vd}(\mathcal{VMU}_{\delta=k} \cap
\mathcal{SED}) \ge \mathrm{nM}(k)$. For deficiencies $k = 2^n - n$, $n \in \mathbb{N}$, we have $\mathrm{nM}(k) = 2^n$, and
thus $A_n$ serves as lower bound example (as shown in Example 9.10), while until the
next jump position we can use Lemma 9.11 together with Lemma 9.5, where due
to Corollary 7.23 in this range also nM increases only by 1 for $k \leadsto k + 1$.


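Using Lemma 4.3, we can now determine the min-var degrees for the classes
$\mathcal{LEAN}$ and $\mathcal{VMU}$, separated into layers via deficiency or surplus:


Corollary 9.13 For $k \in \mathbb{N}$ holds $\mathrm{nM}(k) = \mu\mathrm{vd}(\mathcal{LEAN}_{\delta=k}) = \mu\mathrm{vd}(\mathcal{LEAN}_{\sigma=k}) =
\mu\mathrm{vd}(\mathcal{VMU}_{\delta=k}) = \mu\mathrm{vd}(\mathcal{VMU}_{\sigma=k})$.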
10 Algorithmic implications


In Subsections 10.1 and 10.2 we consider the algorithmic implications of Theorem 9.8.
First in Theorem 10.2 we show that via an autarky-reduction every clause-set $F \in
\mathcal{CLS}$ can be reduced to some $F' \subseteq F$, where $F'$ fulfils the min-var-degree upper
bound of Theorem 9.8 (although $F'$ might not be lean). For this autarky-reduction
we do not know whether we can efficiently compute a certificate, the autarky, and
we discuss Conjecture 10.3, that efficient computation is possible, in Subsection
10.2. We conclude with some remarks on the surplus in Subsection 10.3.


10.1 Autarky reduction


By Theorem 9.8, lean clause-sets fulfil a condition on the minimum variable-degree;
if that condition is not fulfilled, then there exists an autarky. In this section we try
to pinpoint these autarkies. We consider a vast generalisation of lean clause-sets,
namely matching-lean clause-sets (recall that a multi-clause-set
$F$ with $n(F) > 0$ is matching-lean iff $\sigma(F) \ge 1$). It is also useful to note
here the observation that a (multi-)clause-set $F$ has a non-trivial autarky (is not
lean) iff there is $\emptyset \subset V \subseteq \mathrm{var}(F)$ such that $F[V]$ is satisfiable, and the corresponding
autarky reduction of $F$ removes all clauses containing some variable of $V$; note that
to perform this autarky reduction the autarky itself (the satisfying assignment for
$F[V]$) is not needed, only its set $V$ of variables.

We obtain a sufficient criterion for the existence of a non-trivial autarky by
considering the converse of Theorem 9.8:


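Lemma 10.1 Consider a matching-lean multi-clause-set $F$ with $n(F) > 0$. If we
have $\mu\mathrm{vd}(F) > \mathrm{nM}(\sigma(F))$, then for all $F' := F[V]$ with $\emptyset \subset V \subseteq \mathrm{var}(F)$ and
$\delta(F[V]) = \sigma(F)$ we have:

1. $\delta(F') = \sigma(F') = \sigma(F)$ (so $F' \in \mathcal{SED} \cap \mathcal{MLEAN}$).

2. $\mu\mathrm{vd}(F') > \mathrm{nM}(\sigma(F'))$.

3. $F' \in \mathcal{SAT}$ (thus there is $\varphi \in \mathcal{PASS}$ with $\mathrm{var}(\varphi) = V$ and $\varphi * F' = \top$; this $\varphi$
is a non-trivial autarky for $F$).


Proof: Part 1 follows by definitions. For Part 2 note that $\mu\mathrm{vd}(F') \le \mathrm{nM}(\sigma(F'))$
implies $\mu\mathrm{vd}(F) \le \mu\mathrm{vd}(F') \le \mathrm{nM}(\sigma(F))$, contradicting the assumption. And Part 3
follows now by Lemma 9.7.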


To better understand the background, we recall two fundamental facts regarding 
the surplus o(F’) for multi-clause-set F’ with n(F’) > 0: 


1. o(F) together with some § Cc V C var(F) with o(F) = 6(F[V]) can be 
computed in polynomial time (see Subsection 11.1 in ) 


2. If o(F’) < 0, then one can compute a non-trivial matching autarky for F' in 
polynomial time (see Section 7 in or Section 9 in 67). 


We see now that we can reach the conclusion of Theorem 9.8 for arbitrary
inputs $F$ in polynomial time, via some autarky reduction (maintaining
satisfiability-equivalence):


Theorem 10.2 Consider a multi-clause-set $F$. We can find in polynomial time a
sub-clause-set $F' \subseteq F$ such that:


1. There exists an autarky $\varphi$ for $F$ with $F' = \varphi * F$.

2. If $n(F') > 0$, then $\sigma(F') \ge 1$ and $\mu\mathrm{vd}(F') \le \mathrm{nM}(\sigma(F'))$.


Proof: First $F$ is reduced to the underlying clause-set (and all further
computations only handle clause-sets), and if $\bot \in F$, then we reduce $F$ to $\{\bot\}$ and
are finished. Otherwise, the reduction process of $F$ yielding the final $F'$ consists
now of a loop of two steps: First eliminate matching-autarkies (i.e., compute the
matching-lean kernel), such that we reach $\sigma(F) \ge 1$. Then apply the
autarky-reduction according to Part 3 of Lemma 10.1 (removing all clauses containing a
variable of $V$) in case of $\mu\mathrm{vd}(F) > \mathrm{nM}(\sigma(F))$. This loop is aborted if $\top$ is reached
or the criteria no longer apply. All autarkies are composed together (in general
the composition of autarkies is again an autarky), yielding
the final $\varphi$.


In Theorem 10.2 we can only show the existence of an autarky $\varphi$ for $F$ with
$F' = \varphi * F$, however we currently do not know how to compute it efficiently. We
conjecture that it can be found in polynomial time:


Conjecture 10.3 For $F \in \mathcal{CLS}_{\sigma \ge 1}$ there is a poly-time algorithm for computing a
non-trivial autarky $\varphi$ for $F$ in case of $\mu\mathrm{vd}(F) > \mathrm{nM}(\sigma(F))$.


Note that we ask only to find some autarky $\varphi$, not necessarily one given by Lemma
10.1 (i.e., with $\mathrm{var}(\varphi) = V$ as in Part 3 of Lemma 10.1). That this is enough follows
by the fact that the number of variables is reduced by such a reduction, and this
by some autarky:


Lemma 10.4 If Conjecture 10.3 is true, then for the algorithm from Theorem 10.2,
which reduces a multi-clause-set $F$ to some (satisfiability-equivalent) $F' \subseteq F$, we
can also compute an autarky $\varphi$ for $F$ with $F' = \varphi * F$ in polynomial time.


Proof: In the loop as given in the proof of Theorem 10.2, we can replace the
autarky-reduction according to Part 3 of Lemma 10.1 by the reduction $F \leadsto \varphi * F$
according to a (non-trivial) autarky as given by Conjecture 10.3.


In the subsequent Subsection 10.2 we discuss what we know about Conjecture 10.3.


10.2 On finding the autarky 


Consider a matching-lean multi-clause-set $F$ with $n(F) > 0$, where Lemma 10.1 is
applicable (recall that we have $\sigma(F) \ge 1$), that is, we have $\mu\mathrm{vd}(F) > \mathrm{nM}(\sigma(F))$.
So we know that $F$ has a non-trivial autarky. Conjecture 10.3 states that finding
such a non-trivial autarky in this case can be done in polynomial time (recall that
finding a non-trivial autarky in general is NP-complete, which was shown in [51]).

The task of actually finding the autarky can be considered as finding a satisfying
assignment for the following class $\mathcal{MLCR} \subset \mathcal{SAT} \cap \mathcal{MLEAN}$ of satisfiable(!)
multi-clause-sets $F$, obtained by considering all $F[V]$ for minimal sets of variables $V$ with
$\delta(F[V]) = \sigma(F)$ (where “CR” stands for “critical”):


Definition 10.5 Let $\mathcal{MLCR}$ be the class of clause-sets $F$ fulfilling the following
three conditions:


1. $F \in \mathcal{MLEAN}$, $\bot \notin F$, $F \ne \top$.
2. For all $\emptyset \subset V \subset \mathrm{var}(F)$ holds $\delta(F[V]) > \sigma(F)$.
3. $\mu\mathrm{vd}(F) > \mathrm{nM}(\sigma(F))$.

(The definition of MLCR just uses $F \in \mathcal{MLEAN}$ instead.)


The basic properties of this class are collected in the following lemma: 


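Lemma 10.6 For $F \in \mathcal{MLCR}$ holds:
1. $\delta(F) = \sigma(F) \ge 1$ (whence $F \in \mathcal{SED}$).
2. $F \in \mathcal{SAT}$.


Proof: Since $F \in \mathcal{MLEAN}$ and $n(F) > 0$, we have $\sigma(F) \ge 1$. By $\bot \notin F$ we get
$F = F[\mathrm{var}(F)]$, and thus $\sigma(F) = \delta(F[\mathrm{var}(F)]) = \delta(F)$, while $F \in \mathcal{SAT}$ follows by
Lemma 10.1.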


The examples we know for elements of $\mathcal{MLCR}$ are as follows:


Example 10.7 A simple example for $F \in \mathcal{MLCR}_{\delta=1} \cap \mathcal{HIT}$ is given by

$$F := \{\{1, 2\}, \{-1, 2, -3\}, \{-2, 3\}, \{1, -2, -3\}\}.$$

We have $\delta(F) = 4 - 3 = 1$ and $\mu\mathrm{vd}(F) = 3$; for $\sigma(F) = 1$ and Condition 2 of
Definition 10.5 notice that any two variables cover all four clauses, and thus the
minimum of $\delta(F[V])$ is only attained for $V = \mathrm{var}(F)$; finally by $\mu\mathrm{vd}(F) = 3 >
\mathrm{nM}(1) = 2$ we get $F \in \mathcal{MLCR}$, while $F \in \mathcal{HIT}$ by definition (any two clauses have
a clash).

This example also shows that $\mathcal{MLCR}$ is not invariant under multiplicities:
Obtain $F'$ from $F$ by duplicating the first clause. We still have $F' \in \mathcal{MLEAN}$, but
$\delta(F') = 2$, while $\delta(F'[\{3\}]) = 2$ as well, and thus $F' \notin \mathcal{MLCR}$. So duplicating a
clause can lead outside of $\mathcal{MLCR}$. In the other direction, removing a duplication,
we have the following simple (counter-)example: Let $F := \{2 * \{1\}\}$; trivially we
have $F \in \mathcal{MLCR}$, but for the underlying clause-set we have $\{\{1\}\} \notin \mathcal{MLEAN}$,
thus it is not in $\mathcal{MLCR}$.

A more general class of examples is obtained by full clause-sets. Let $F$ be a full
clause-set and $n := n(F)$, $m := c(F)$. Then $F \in \mathcal{MLCR}$ iff $n < m < 2^n$:


1. We have $\delta(F) = m - n$ and thus $\delta(F) \ge 1 \Leftrightarrow m > n$.
2. Furthermore $F \in \mathcal{SAT} \Leftrightarrow m < 2^n$.


3. For $\emptyset \subset V \subseteq \mathrm{var}(F)$ we have $\delta(F[V]) = m - |V|$. Thus $\sigma(F) = \delta(F)$, and
Condition 2 of Definition 10.5 is fulfilled.


4. It remains to show the condition on the min-var degree: We have $\mu\mathrm{vd}(F) = m$,
while $\mathrm{nM}(\sigma(F)) = \mathrm{nM}(m - n)$. By Theorem 7.21 we obtain $\mathrm{nM}(m - n) =
m - n + \mathrm{fld}(m - n + 1 + \mathrm{fld}(m - n + 1))$. We obtain for $n \ge 1$:

$$\mu\mathrm{vd}(F) > \mathrm{nM}(\sigma(F)) \Leftrightarrow m > m - n + \mathrm{fld}(m - n + 1 + \mathrm{fld}(m - n + 1)) \Leftrightarrow$$
$$\mathrm{fld}(m - n + 1 + \mathrm{fld}(m - n + 1)) < n \Leftarrow$$
$$\mathrm{fld}(2^n - 1 - n + 1 + \mathrm{fld}(2^n - 1 - n + 1)) < n \Leftrightarrow$$
$$\mathrm{fld}(2^n - n + \mathrm{fld}(2^n - n)) = \mathrm{fld}(2^n - n + (n - 1)) = \mathrm{fld}(2^n - 1) < n.$$


Finally we note that the class $\mathcal{MLCR}$ is invariant against changes of polarities
of literal occurrences (if clauses become equal in this way, then their multiplicities
have to be added), and thus for example replacing all clauses $C \in F \in \mathcal{MLCR}$
by their positive forms, $\mathrm{var}(C)$, we obtain a positive (no complementations occur)
multi-clause-set $F' \in \mathcal{MLCR}$ (with $c(F') = c(F)$ and $n(F') = n(F)$).
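The numerical claims about the four-clause example can be confirmed by brute force. The sketch below is illustrative only; the representation (lists of frozensets of nonzero integers), the exhaustive satisfiability test and all function names are assumptions made here, and the exponential enumeration is only meant for such tiny inputs.

```python
from itertools import combinations, product

F = [frozenset(c) for c in ({1, 2}, {-1, 2, -3}, {-2, 3}, {1, -2, -3})]


def fld(x):
    return x.bit_length() - 1


def nM(k):
    return k + fld(k + 1 + fld(k + 1))


def variables(F):
    return sorted({abs(x) for C in F for x in C})


def deficiency(F):
    return len(F) - len(variables(F))


def restricted_deficiency(F, V):
    """delta(F[V]): clauses containing a variable of V, minus |V|."""
    return sum(1 for C in F if any(abs(x) in V for x in C)) - len(V)


def surplus_witnesses(F):
    """Return (sigma(F), list of all nonempty V with delta(F[V]) = sigma(F))."""
    vs = variables(F)
    subsets = [V for r in range(1, len(vs) + 1) for V in combinations(vs, r)]
    sigma = min(restricted_deficiency(F, V) for V in subsets)
    return sigma, [V for V in subsets if restricted_deficiency(F, V) == sigma]


def min_var_degree(F):
    return min(sum(1 for C in F if v in C or -v in C) for v in variables(F))


def is_hitting(F):
    """Every two different clauses clash in at least one variable."""
    return all(any(-x in D for x in C) for C, D in combinations(F, 2))


def is_satisfiable(F):
    vs = variables(F)
    for signs in product((1, -1), repeat=len(vs)):
        true_literals = {s * v for s, v in zip(signs, vs)}
        if all(any(x in true_literals for x in C) for C in F):
            return True
    return False


def full_clause_set_degree_condition(n, m):
    """Condition 3 for a full clause-set with n variables and m clauses."""
    return m > nM(m - n)
```

Here `surplus_witnesses(F)` returns surplus 1 with the single witness `(1, 2, 3)`, and after duplicating the first clause the witness `(3,)` appears as well, which is why the duplicated version leaves the class.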
The importance of $\mathcal{MLCR}$ is that it is sufficient to find a non-trivial autarky for
this class of satisfiable clause-sets. In order to show this, we need to strengthen the
polytime computation of $\sigma(F)$:


Lemma 10.8 For a multi-clause-set $F$ with $n(F) > 0$ we can compute in polynomial
time a minimal subset $\emptyset \subset V \subseteq \mathrm{var}(F)$ with $\delta(F[V]) = \sigma(F)$.


Proof: Let $V := \mathrm{var}(F)$. Check whether there is $v \in V$ with $V \setminus \{v\} \ne \emptyset$ and
$\sigma(F[V \setminus \{v\}]) = \sigma(F)$; if yes, then $V := V \setminus \{v\}$ and repeat, if not, then $V$ is the desired result.
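The greedy shrinking in this proof can be sketched as follows. This is only an illustration: the surplus is computed here by exhaustive enumeration, which is exponential and merely stands in for the polynomial-time surplus computation that the lemma relies on. The helper names and the list-of-frozensets representation are assumptions made here. The sketch uses that $\sigma(F[V])$ is the minimum of $\delta(F[W])$ over nonempty $W \subseteq V$.

```python
from itertools import combinations


def variables(F):
    return sorted({abs(x) for C in F for x in C})


def restricted_deficiency(F, W):
    """delta(F[W]): clauses containing a variable of W, minus |W|."""
    return sum(1 for C in F if any(abs(x) in W for x in C)) - len(W)


def surplus_within(F, V):
    """sigma(F[V]), as the minimum of delta(F[W]) over nonempty W inside V."""
    return min(restricted_deficiency(F, W)
               for r in range(1, len(V) + 1)
               for W in combinations(V, r))


def minimal_surplus_set(F):
    """Greedily remove variables while the surplus of F[V] stays sigma(F)."""
    V = variables(F)
    sigma = surplus_within(F, V)
    changed = True
    while changed:
        changed = False
        for v in V:
            rest = [u for u in V if u != v]
            if rest and surplus_within(F, rest) == sigma:
                V = rest
                changed = True
                break
    return sigma, set(V)
```

When the loop stops, no proper subset of $V$ attains the surplus any more, so the minimum is attained exactly at $V$ itself.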


We are ready to show that $\mathcal{MLCR}$ is really the “critical class” for the problem
of finding the witness-autarky underlying the reduction $F \leadsto F'$ of Theorem 10.2:


Theorem 10.9 Consider $F \in \mathcal{CLS}$ with $\sigma(F) \ge 1$ and $\mu\mathrm{vd}(F) > \mathrm{nM}(\sigma(F))$.


1. For every minimal subset $\emptyset \subset V \subseteq \mathrm{var}(F)$ with $\delta(F[V]) = \sigma(F)$ we have
$F[V] \in \mathcal{MLCR}$.


2. Thus we can compute in polytime some $\emptyset \subset V \subseteq \mathrm{var}(F)$ with $F[V] \in \mathcal{MLCR}$.


3. So Conjecture 10.3 is equivalent to the statement that finding a non-trivial
autarky for clause-sets in $\mathcal{MLCR}$ can be achieved in polynomial time.


Proof: Part 1 follows with Lemma 10.1. Part 2 follows from Part 1 with Lemma
10.8. Part 3 follows with Part 2 (note that every autarky for some $F[V]$ yields an
autarky for $F$).


Since $\mathcal{MLCR} \subseteq \mathcal{SED}$, if both questions of Question 9.6 have a yes-answer, then
this would prove Conjecture 10.3.


10.3 Final remarks on the surplus


It is instructive to investigate the precise relationship between minimum
variable-degree and the surplus for lean clause-sets, which by Corollary 9.9 are indeed very
close. Small values behave as follows:


Lemma 10.10 Consider $F \in \mathcal{LEAN} \setminus \{\top\}$ (so $\sigma(F) \ge 1$ and $\mu\mathrm{vd}(F) \ge 2$).
1. $\sigma(F) = 1$ holds if and only if $\mu\mathrm{vd}(F) = 2$ holds.
2. $\mu\mathrm{vd}(F) = 3$ implies $\sigma(F) = 2$.
3. $\sigma(F) = 2$ implies $\mu\mathrm{vd}(F) \in \{3, 4\}$.
4. $\mu\mathrm{vd}(F) = 4$ implies $\sigma(F) \in \{2, 3\}$.


Proof: First consider Part 1. If $\sigma(F) = 1$ (so $n(F) > 0$), then by Theorem 9.8
we have $\mu\mathrm{vd}(F) \le \mathrm{nM}(1) = 2$, while in case of $\mu\mathrm{vd}(F) = 1$ there would be a
matching autarky for $F$. If on the other hand $\mu\mathrm{vd}(F) = 2$ holds, then by definition
$\sigma(F) \le 2 - 1 = 1$, while $\sigma(F) \ge 1$ holds since $F$ is matching lean. For Part 2 note
that due to $\sigma(F) + 1 \le \mu\mathrm{vd}(F)$ we have $\sigma(F) \le 2$, and then the assertion follows
by Part 1. Part 4 follows in the same way. Finally Part 3 follows by Part 1 and
$\mathrm{nM}(2) = 4$.


For some examples we use $F$ with $\delta(F) = \sigma(F)$:


Example 10.11 Examples for cases $\sigma(F) \in \{2, 3\}$ in Lemma 10.10:
1. An example for $\mu\mathrm{vd}(F) = 4$ in Part 3 with $F \in \mathcal{UHIT} \cap \mathcal{SED}$ is given by $A_2$.


2. For { {a,b,c}, {a,b,c}, {a, b, c}, {a,b c}, {a, t}, {a,c} } € UHTT N SED we 
have pvd(F’) = 4 and o(F) = 3 (Part [)). 


Question 10.12 Is there for every $k \in \mathbb{N}$ an $F \in \mathcal{UHIT} \cap \mathcal{SED}$ with $\sigma(F) = k$
and $\mu\mathrm{vd}(F) = k + 1$?


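As we have for $\mathcal{MU}$ the levels $\mathcal{MU}_{\delta=k}$ for $k = 1, 2, \ldots$, we can consider for
$\mathcal{LEAN}$ the levels $\mathcal{LEAN}_{\sigma=k}$ for $k = 1, 2, \ldots$. However, while the levels $\mathcal{MU}_{\delta=k}$
as well as $\mathcal{LEAN}_{\delta=k}$ all are decidable in polynomial time, already the first level
$\mathcal{LEAN}_{\sigma=1}$ is coNP-complete: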


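Lemma 10.13 Consider the map $E : \mathcal{CLS} \to \mathcal{CLS}$, which has $E(\top) := \top$, while
otherwise for $F \in \mathcal{CLS} \setminus \{\top\}$ it chooses (by some rule; it doesn't matter) a clause
$C \in F$ and a variable $v \in \mathcal{VA} \setminus \mathrm{var}(F)$, and replaces $C$ by $C \cup \{v\}$, $C \cup \{\bar v\}$; in
other words, a non-strict full subsumption extension $F \leadsto E(F)$ is performed, as
in Example 9.10. Then we have for $F \in \mathcal{CLS}$:


1. $F \in \mathcal{LEAN}$ iff $E(F) \in \mathcal{LEAN}$.
2. $F \in \mathcal{MU}$ iff $E(F) \in \mathcal{MU}$.
3. $\sigma(E(F)) \le 1$.
Thus $\mathcal{LEAN}_{\sigma=1}$ is coNP-complete, while $\mathcal{MU}_{\sigma=1}$ is $D^P$-complete.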


Proof: The properties of the map FE are trivial. The completeness-properties 
follow with the coNP-completeness of LEAN (|[b1]) and the D?-completeness of 


MU (RQ). 


With Lemma 10.13 we also get easy examples for minimally unsatisfiable
clause-sets of arbitrary deficiency and surplus 1.


11 Matching lean clause-sets 


In this section, which concludes our considerations on generalisations (beyond $\mathcal{MU}$),
we consider the question whether Theorem 9.8 can incorporate non-lean clause-sets.

We consider the large class $\mathcal{MLEAN}$ of matching lean clause-sets, which is
natural, since a basic property of $F \in \mathcal{MU}$ used in the proof of Theorem 8.3 is
$\delta(F) \ge 1$ for $F \ne \top$, and this actually holds for all $F \in \mathcal{MLEAN}$. We will construct
for arbitrary deficiency $k \in \mathbb{N}$ and $K \in \mathbb{N}$ clause-sets $F \in \mathcal{MLEAN}$ of deficiency
$k$, where every variable occurs positively at least $K$ times. Thus neither the upper
bound $\max(\mathrm{ld}_F(v), \mathrm{ld}_F(\bar v)) \le f(\delta(F))$ nor $\mathrm{ld}_F(v) + \mathrm{ld}_F(\bar v) = \mathrm{vd}_F(v) \le f(\delta(F))$ for
some chosen variable $v$ and for any function $f$ does hold for $\mathcal{MLEAN}$.

An example for F € MLEAN5=1 with pld(F) > 2 (the minimal literal degree; 
and thus pvd(F’) > 4) is given in Section 5 in 64, displaying a “star-free” (thus 
satisfiable) clause-set F’ with deficiency 1. In Subsection 9.3 in ba it is shown that 
this clause-set is matching lean. “Star-freeness” in our context means, that there 
are no singular variables (occurring in one sign only once). Our simpler construction
pushes the number of positive occurrences arbitrarily high, but there are variables
with only one negative occurrence (i.e., there are singular variables).

For a finite set $V$ of variables let $M(V) \subseteq A(V)$ be the full clause-set over
$V$ containing all full clauses with at most one complementation; e.g. $M(\{1, 2\}) =
\{\{1, 2\}, \{-1, 2\}, \{1, -2\}\}$:
1. Obviously $n(M(V)) = |V|$, $c(M(V)) = |V| + 1$ and $\delta(M(V)) = 1$ holds.


2. We have already seen that $M(V) \in \mathcal{MLEAN}$ (for $\top \ne F' \subset F \subseteq A(V)$ we
have $\delta(F') < \delta(F)$, and thus a full clause-set $F$ is matching lean iff $\delta(F) \ge 1$).


3. By definition we have $\mathrm{ld}_{M(V)}(v) = |V|$ and $\mathrm{ld}_{M(V)}(\bar v) = 1$ for $v \in V$.
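The three properties of $M(V)$ are easy to confirm computationally. The following sketch is an illustration with names chosen here; variables are positive integers and a complemented literal is the negated integer.

```python
def M(V):
    """All full clauses over V with at most one complemented literal."""
    V = sorted(V)
    clauses = {frozenset(V)}  # the clause without any complementation
    for v in V:
        # Complement exactly the variable v, keep all others positive.
        clauses.add(frozenset(-u if u == v else u for u in V))
    return clauses


def variables(F):
    return {abs(x) for C in F for x in C}


def deficiency(F):
    return len(F) - len(variables(F))


def literal_degree(F, x):
    return sum(1 for C in F if x in C)
```

For example, `M({1, 2})` yields the three clauses $\{1,2\}$, $\{-1,2\}$, $\{1,-2\}$, and for larger $V$ each positive literal occurs $|V|$ times while each negative literal occurs once.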


Lemma 11.1 For $k \in \mathbb{N}$ and $K \in \mathbb{N}$ there are $F \in \mathcal{MLEAN}_{\delta=k}$ with $F \in \mathcal{USAT}$
for $k \ge 2$ such that for all variables $v \in \mathrm{var}(F)$ we have $\mathrm{ld}_F(v) \ge K$.


Proof: For $k = 1$ we can set $F := M(\{v_1, \ldots, v_K\})$; so assume $k \ge 2$. Consider
any clause-set $G \in \mathcal{MU}_{\delta=k-1}$ with $n := n(G) \ge K$, and let $V := \mathrm{var}(G)$. Consider
a disjoint copy of $V$, that is a set $V'$ of variables with $V' \cap V = \emptyset$ and $|V'| = |V|$,
and consider two enumerations of the clauses $M(V) = \{C_1, \ldots, C_{n+1}\}$, $M(V') =
\{C'_1, \ldots, C'_{n+1}\}$. Now

$$F := G \cup \{C_i \cup C'_i : i \in \{1, \ldots, n+1\}\}$$

has no matching autarky: If $\varphi$ is a matching autarky for $F$, then $\mathrm{var}(\varphi) \cap V = \emptyset$
since $G$ is matching lean, whence $\mathrm{var}(\varphi) \cap V' = \emptyset$ since $M(V')$ is matching lean, and
thus $\varphi$ must be trivial. Furthermore we have $n(F) = 2n$ and $c(F) = c(G) + n + 1$,
and thus $\delta(F) = c(G) + n + 1 - 2n = \delta(G) + 1 = k$. By definition for all variables
$v \in \mathrm{var}(F)$ we have $\mathrm{ld}_F(v) \ge n$.
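The counting part of this construction can be traced on a small instance. The sketch below is only an illustration under assumptions made here: it takes $G := A_2$ (so $k = 3$ and $K = 2$), uses the variables 3, 4 as the disjoint copy, and pairs the clauses of $M(V)$ and $M(V')$ in an arbitrary fixed order. It checks the counts of variables and clauses and the positive literal degrees, not matching leanness. All function names are chosen here.

```python
from itertools import product


def M(V):
    """All full clauses over V with at most one complemented literal."""
    V = sorted(V)
    clauses = {frozenset(V)}
    for v in V:
        clauses.add(frozenset(-u if u == v else u for u in V))
    return clauses


def full_clause_set(n):
    """A_n: all 2^n full clauses over the variables 1..n."""
    return {frozenset(s * v for s, v in zip(signs, range(1, n + 1)))
            for signs in product((1, -1), repeat=n)}


def variables(F):
    return {abs(x) for C in F for x in C}


def deficiency(F):
    return len(F) - len(variables(F))


def combine(G, V, V_copy):
    """G together with the clauses C_i + C'_i for paired enumerations of M(V), M(V')."""
    left = sorted(M(V), key=sorted)
    right = sorted(M(V_copy), key=sorted)
    return set(G) | {C | D for C, D in zip(left, right)}
```

With `G = full_clause_set(2)` and `F = combine(G, {1, 2}, {3, 4})` one obtains 7 clauses over 4 variables, hence deficiency 3, one more than the deficiency of `G`.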


For $k = 1$ the examples of Lemma 11.1 for $K \ge 3$ are necessarily satisfiable, since
$\mathcal{MLEAN}_{\delta=1} \cap \mathcal{USAT} = \mathcal{MU}_{\delta=1}$. It remains the question whether the singular
variables can be eliminated:


Question 11.2 Are there examples for deficiency $k \in \mathbb{N}$ of $F \in \mathcal{MLEAN}_{\delta=k}$ with
$\mu\mathrm{ld}(F) \ge k + 1$?


1. The above mentioned star-free clause-set shows that this is the case for $k = 1$.


2. What about the stronger condition $\mu\mathrm{ld}(F) \ge K$ for arbitrary $K \in \mathbb{N}$?


12 Lower bounds for the min-var-degree of MU 


We now return to (boolean) minimally unsatisfiable clause-sets. By Theorem 8.3
we have $\mu\mathrm{vd}(\mathcal{MU}_{\delta=k}) \le \mathrm{nM}(k)$ for all $k \in \mathbb{N}$. The task of precisely determining
$\mu\mathrm{vd}(\mathcal{MU}_{\delta=k})$ for all $k$ seems a deep question, and is the subject of the remainder
of this report. First we introduce a notation for the true bound on $\mu\mathrm{vd}(\mathcal{MU}_{\delta=k})$:


Definition 12.1 For $k \in \mathbb{N}$ let $\mu\mathrm{nM}(k) := \mu\mathrm{vd}(\mathcal{MU}_{\delta=k}) \in \mathbb{N}$ (the “minimum
non-Mersenne number” for (deficiency) $k$).


So by Theorem 8.3 we have $\mu\mathrm{nM} \le \mathrm{nM}$. All our examples yielding lower bounds
on $\mu\mathrm{nM}(k)$ are actually (unsatisfiable) hitting clause-sets, and thus we believe


Conjecture 12.2 For all $k \in \mathbb{N}$ holds $\mu\mathrm{nM}(k) = \mu\mathrm{vd}(\mathcal{UHIT}_{\delta=k})$.


We will see in Theorem that unM ~nM. We believe that wnM is a highly 
complicated function, but the true values deviate only at most by one from nM: 


Conjecture 12.3 For all $k \in \mathbb{N}$ we have $\mu\mathrm{nM}(k) \ge \mathrm{nM}(k) - 1$.
By Corollary we get: 
Lemma 12.4 If Conjecture 12.3 holds, then $k - 1 + \mathrm{fld}(k + 1) \le \mu\mathrm{nM}(k) \le k +
1 + \mathrm{fld}(k)$ holds for $k \in \mathbb{N}$.


Later in Lemma we will see that $\mu\mathrm{nM} : \mathbb{N} \to \mathbb{N}$ is monotonically increasing.
While in Theorem we will (implicitly) construct a correction function $\gamma_1 : \mathbb{N} \to \{0,1\}$
such that $\mu\mathrm{nM} \le \mathrm{nM} - \gamma_1$, where we remark in the Conclusion (Section)
that also $\mu\mathrm{nM} \ne \mathrm{nM} - \gamma_1$ holds. Note that Conjecture says that there exists
$\gamma : \mathbb{N} \to \{0,1\}$ with $\mu\mathrm{nM} = \mathrm{nM} - \gamma$, while for every $\gamma : \mathbb{N} \to \{0,1\}$ the function
$\mathrm{nM} - \gamma$ is still monotonically increasing (by Lemma 7.5), and is thus a possible
candidate.

In Subsection we provide a general method for obtaining lower bounds, via
considering full clauses (while in Section 13 we turn to improved upper bounds).
Namely we introduce $\nu\mathrm{fc}(\mathcal{MU}_{\delta=k}) \in \mathbb{N}$ for $k \in \mathbb{N}$, the maximal number of full
clauses in $F \in \mathcal{MU}_{\delta=k}$. According to our numerical investigations the number of
full clauses is very close to $\mu\mathrm{nM}(k)$, and indeed to $\mathrm{nM}(k)$:


Conjecture 12.5 For all $k \in \mathbb{N}$ we have $\nu\mathrm{fc}(\mathcal{MU}_{\delta=k}) \ge \mathrm{nM}(k) - 1$.


Conjecture implies Conjecture; regarding $\nu\mathrm{fc}(\mathcal{UHIT}_{\delta=k})$, there might be
unbounded gaps to $\nu\mathrm{fc}(\mathcal{MU}_{\delta=k})$. The smallest deficiency $k$ with $\nu\mathrm{fc}(\mathcal{MU}_{\delta=k}) =
\mathrm{nM}(k) - 1$ (and also $\nu\mathrm{fc}(\mathcal{MU}_{\delta=k}) = \mu\mathrm{nM}(k) - 1$) is $k = 3$, as shown in Lemma
(together with Theorem 14.3). We show for two infinite classes of deficiencies
$k$ that $\nu\mathrm{fc}(\mathcal{UHIT}_{\delta=k}) = \mu\mathrm{nM}(k) = \mathrm{nM}(k)$ holds (Lemmas 12.10, 12.11). Actually,
the main point here could be considered as (just) the equalities $\mu\mathrm{nM}(k) = \mathrm{nM}(k)$,
for which in these two cases the proofs don't need to consider full clauses, and so the
general method for computing lower bounds on $\nu\mathrm{fc}(\mathcal{MU}_{\delta=k})$, with the beginnings
developed in Subsection 12.2, is not applied here. However in future work we will
employ this method more fully (see Subsection 15.3), and, more important for the
report at hand, we need for the proof of Theorem (that $\mu\mathrm{nM}(6) = \mathrm{nM}(6) - 1 =
8$) the fact $\nu\mathrm{fc}(\mathcal{MU}_{\delta=3}) \le 4$, shown in Lemma 12.16.


12.1 Some precise values for the min-var-degree of MU 


A general lower-bound method for $\mu\mathrm{nM}$ is provided by the number $\mathrm{fc}(F)$ of full
clauses in a clause-set $F$. The supremum $\nu\mathrm{fc}(\mathcal{MU}_{\delta=k})$ of this number over all
elements of $\mathcal{MU}_{\delta=k}$ for fixed $k$ is an interesting quantity in its own right, but in
this report we only touch on this subject, providing the bare minimum of information
needed in our context. See Subsection for an outlook on the interesting
properties of this quantity.


Definition 12.6 For a clause-set $F \in \mathcal{CLS}$ let $\mathrm{fc}(F) \in \mathbb{N}_0$ be the number of full
clauses, that is $\mathrm{fc}(F) := |\{C \in F : \mathrm{var}(C) = \mathrm{var}(F)\}|$. And for a class $\mathcal{C} \subseteq \mathcal{CLS}$
of clause-sets we define $\mathrm{fc}(\mathcal{C}) := \{\mathrm{fc}(F) : F \in \mathcal{C}\} \subseteq \mathbb{N}_0$ as the set of all possible
numbers of full clauses, while $\nu\mathrm{fc}(\mathcal{C}) \in \mathbb{N}_0 \cup \{+\infty\}$ is the supremum of $\mathrm{fc}(\mathcal{C})$.


Some simple examples: 


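Example 12.7 $\mathrm{fc}(\top) = 0$, $\mathrm{fc}(\{\bot\}) = 1$, and $\mathrm{fc}(\{\{1\}, \{-1, 2\}\}) = 1$. While $\mathrm{fc}(\emptyset) =
\emptyset$, thus $\nu\mathrm{fc}(\emptyset) = 0$, and $\mathrm{fc}(\mathcal{CLS}) = \mathbb{N}_0$, thus $\nu\mathrm{fc}(\mathcal{CLS}) = +\infty$.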
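

Definition 12.6 can be checked mechanically on the examples above. In the sketch below (an illustration, not part of the report), a clause is modelled as a frozenset of non-zero integers, with -v standing for the complement of variable v; this encoding and the helper names are assumptions.

```python
def clause_vars(C):
    """Underlying variables of a clause given as a set of non-zero integers."""
    return frozenset(abs(lit) for lit in C)


def clause_set_vars(F):
    """var(F): the union of the variables of all clauses of F."""
    return frozenset().union(*(clause_vars(C) for C in F))


def fc(F):
    """Number of full clauses, i.e. clauses C with var(C) = var(F)."""
    V = clause_set_vars(F)
    return sum(1 for C in F if clause_vars(C) == V)


TOP = frozenset()      # the empty clause-set
BOTTOM = frozenset()   # the empty clause

examples = {
    "top": fc(TOP),
    "bottom only": fc({BOTTOM}),
    "{{1},{-1,2}}": fc({frozenset({1}), frozenset({-1, 2})}),
}
print(examples)
```

Note that the empty clause counts as full in the clause-set containing only the empty clause, because both variable sets are empty.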


By definition we have: 


Lemma 12.8 $\mathrm{fc}(F) \le \mu\mathrm{vd}(F)$ holds for every $F \in \mathcal{CLS}$ (since every variable in $F$
has degree at least $\mathrm{fc}(F)$), and thus $\nu\mathrm{fc}(\mathcal{C}) \le \mu\mathrm{vd}(\mathcal{C})$ for every $\mathcal{C} \subseteq \mathcal{CLS}$.


We obtain that for lean clause-sets (especially minimally unsatisfiable clause-sets)
of fixed deficiency the number of full clauses is bounded:


Corollary 12.9 A lean clause-set of deficiency $k$ can have at most $\mathrm{nM}(k)$ many
full clauses; i.e., for all $k \in \mathbb{N}$ we have $\nu\mathrm{fc}(\mathcal{LEAN}_{\delta=k}) \le \mathrm{nM}(k)$.


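Precise values for $\nu\mathrm{fc}(\mathcal{MU}_{\delta=k}) = \mu\mathrm{nM}(k)$ we show for two infinite classes of
deficiencies. The simplest class are the deficiencies directly after the jumps (recall
Lemma 7.20), the deficiencies of the $A_n$:


Lemma 12.10 For $n \in \mathbb{N}$ and $k := 2^n - n$ holds

$\nu\mathrm{fc}(\mathcal{UHIT}_{\delta=k}) = \nu\mathrm{fc}(\mathcal{MU}_{\delta=k}) = \mu\mathrm{vd}(\mathcal{UHIT}_{\delta=k}) = \mu\mathrm{nM}(k) = \mathrm{nM}(k) = 2^n$.


Proof: We have $\mathrm{fc}(A_n) = 2^n$ (recall Lemma p.14), and thus $\nu\mathrm{fc}(\mathcal{UHIT}_{\delta=k}) \ge 2^n$,
while by Corollary we have $\mathrm{nM}(k) = 2^n$.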
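

The case of zero resolution steps can be replayed for small n. The following sketch builds A_n as the set of all 2^n full clauses over n variables (the encoding with signed integers and all helper names are assumptions) and compares fc(A_n), the minimum variable-degree, and nM at the deficiency 2^n - n.

```python
from itertools import product


def A(n):
    """All 2^n full clauses over variables 1..n, as frozensets of signed ints."""
    return {frozenset(s * v for s, v in zip(signs, range(1, n + 1)))
            for signs in product((1, -1), repeat=n)}


def variables(F):
    return {abs(lit) for C in F for lit in C}


def deficiency(F):
    """delta(F) = c(F) - n(F)."""
    return len(F) - len(variables(F))


def fc(F):
    V = variables(F)
    return sum(1 for C in F if {abs(lit) for lit in C} == V)


def min_var_degree(F):
    return min(sum(1 for C in F if v in C or -v in C) for v in variables(F))


def nM(k):
    """k-th natural number not of the form 2^n - 1."""
    count, n = 0, 0
    while count < k:
        n += 1
        if (n + 1) & n != 0:
            count += 1
    return n


rows = []
for n in range(1, 5):
    F = A(n)
    k = deficiency(F)
    rows.append((n, k, fc(F), min_var_degree(F), nM(k)))
print(rows)  # columns: n, deficiency, fc, min var-degree, nM(deficiency)
```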


Also for the jumps themselves (recall Definition and Lemma 7.20) the same
conclusions hold, namely by Lemma B.4, Part 2 (and the proof) we have:


Lemma 12.11 For all $k \in J$ holds
$\nu\mathrm{fc}(\mathcal{UHIT}_{\delta=k}) = \nu\mathrm{fc}(\mathcal{MU}_{\delta=k}) = \mu\mathrm{vd}(\mathcal{UHIT}_{\delta=k}) = \mu\mathrm{nM}(k) = \mathrm{nM}(k)$.


Note that for $k \in J$ there is $n \in \mathbb{N}$, $n \ge 2$, with $k = 2^n - n - 1$ and $\mathrm{nM}(k) =
2^n - 2$. The underlying method of Lemmas 12.10, 12.11 is simple (as we already
have explained it in Subsection 1.3): start with $A_n$ and apply strict full subsumption
resolution to full clauses. Zero steps have been used in Lemma 12.10, one step in
Lemma 12.11, and one example for two steps will be seen in the proof of Theorem
14.1. The further development of this method we have to leave for future work.


12.2 On the number of full clauses 


We have a special interest in those $F \in \mathcal{MU}$ where the lower bound $\mathrm{fc}(F)$ meets the
upper bound $\mu\mathrm{vd}(F)$. In this case this number must be even, and we obtain another
$F' \in \mathcal{MU}$ by resolving on any variable realising the minimum variable degree:


Lemma 12.12 Consider $F \in \mathcal{MU}$ with $\mathrm{fc}(F) = \mu\mathrm{vd}(F)$. Then $\mathrm{fc}(F)$ is even.


Proof: Consider $v \in \mathrm{var}(F)$ with $\mathrm{vd}_F(v) = \mu\mathrm{vd}(F)$. The occurrences of $v$ are now
exactly in the full clauses of $F$. Every full clause $C$ must be resolvable on $v$ with
another full clause $D$, yielding $E := C \diamond D$, and thus the full clauses of $F$ can be
partitioned into pairs $\{v\} \cup E$, $\{\overline{v}\} \cup E$ (disjoint unions) for $\mathrm{fc}(F)/2$ many clauses $E$
(of length $n(F) - 1$; note that because of fullness for a given $E$ the clauses $C, D$ are
uniquely determined up to order).


Thus, if lower and upper bound match, they must be even numbers:


Corollary 12.13 If $\nu\mathrm{fc}(\mathcal{MU}_{\delta=k}) = \mathrm{nM}(k)$ or $\nu\mathrm{fc}(\mathcal{MU}_{\delta=k}) = \mu\mathrm{vd}(\mathcal{MU}_{\delta=k})$ for
some $k \in \mathbb{N}$, then $\nu\mathrm{fc}(\mathcal{MU}_{\delta=k})$ is even.


Another property of $\mathrm{fc}(\mathcal{MU}_{\delta=k})$ related to evenness is that if $m$ is a possible
number of full clauses, then $2m$ is a possible number for $\delta = k + m - 1$:


Lemma 12.14 $2m \in \mathrm{fc}(\mathcal{MU}_{\delta=k+m-1})$ for $k \in \mathbb{N}$ and $m \in \{1, \dots, \nu\mathrm{fc}(\mathcal{MU}_{\delta=k})\}$.


Proof: Consider $F \in \mathcal{MU}_{\delta=k}$ with $\mathrm{fc}(F) = \nu\mathrm{fc}(\mathcal{MU}_{\delta=k})$. Choose $m$ of the full
clauses of $F$, and choose a new variable $v \notin \mathrm{var}(F)$. Replace each of the chosen full
clauses $C \in F$ by two clauses $C \cup \{v\}$, $C \cup \{\overline{v}\}$ (one non-strict and $m - 1$ strict full
subsumption extensions), obtaining $F'$. We have $F' \in \mathcal{MU}_{\delta=k+m-1}$ and $\mathrm{fc}(F') =
2m$.
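

The construction in this proof is easy to replay on a small instance. The sketch below starts from A_2 (deficiency 2, four full clauses), doubles m = 2 of its full clauses with a new variable, and confirms by brute force that the result is minimally unsatisfiable with deficiency 2 + 2 - 1 = 3 and 2m = 4 full clauses. The signed-integer encoding and the helper names are assumptions, and the exhaustive test is only feasible for tiny clause-sets.

```python
from itertools import product


def variables(F):
    return sorted({abs(lit) for C in F for lit in C})


def is_sat(F):
    """Brute-force satisfiability test over all total assignments."""
    V = variables(F)
    for bits in product((False, True), repeat=len(V)):
        val = dict(zip(V, bits))
        if all(any(val[abs(lit)] == (lit > 0) for lit in C) for C in F):
            return True
    return False


def is_mu(F):
    """Minimally unsatisfiable: unsatisfiable, but every proper removal is satisfiable."""
    return not is_sat(F) and all(is_sat(F - {C}) for C in F)


def deficiency(F):
    return len(F) - len(variables(F))


def fc(F):
    V = set(variables(F))
    return sum(1 for C in F if {abs(lit) for lit in C} == V)


def extend_full_clauses(F, chosen, v):
    """Replace each chosen clause C by C + {v} and C + {-v}; v must be a new variable."""
    return (F - chosen) | {C | {v} for C in chosen} | {C | {-v} for C in chosen}


A2 = {frozenset(c) for c in [(1, 2), (1, -2), (-1, 2), (-1, -2)]}
chosen = {frozenset((1, 2)), frozenset((1, -2))}
F2 = extend_full_clauses(A2, chosen, 3)
print(deficiency(F2), fc(F2), is_mu(F2))
```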


As a special case we obtain that 2, 4 are always possible for the number of full
clauses (except for $k = 1$):


Corollary 12.15 For $k \in \mathbb{N}$ holds $2 \in \mathrm{fc}(\mathcal{MU}_{\delta=k})$, and if $k \ge 2$ then $4 \in
\mathrm{fc}(\mathcal{MU}_{\delta=k})$.


Proof: We show the assertion by induction over $k$, using Lemma 12.14, as follows:
We have $\{\{1\}, \{-1\}\} \in \mathcal{MU}_{\delta=1}$, so consider $k \ge 2$. We know $2 \in \mathrm{fc}(\mathcal{MU}_{\delta=k-1})$,
thus $4 \in \mathrm{fc}(\mathcal{MU}_{\delta=k-1+2-1=k})$. And once we have any $F \in \mathcal{MU}_{\delta=k}$ with a full clause,
we get $F' \in \mathcal{MU}_{\delta=k}$ with $\mathrm{fc}(F') = 2$ by performing a non-strict full subsumption
extension on that full clause (i.e., $F' = E(F)$ as in Lemma with $C$ a full
clause).


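We now turn to the determination of $\nu\mathrm{fc}(\mathcal{MU}_{\delta=k})$ for $k = 1, 2, 3$.


Lemma 12.16 We have:
1. $\nu\mathrm{fc}(\mathcal{MU}_{\delta=1}) = 2 = \mathrm{nM}(1)$.
2. $\nu\mathrm{fc}(\mathcal{MU}_{\delta=2}) = 4 = \mathrm{nM}(2)$.
3. $\nu\mathrm{fc}(\mathcal{MU}_{\delta=3}) = 4 = \mathrm{nM}(3) - 1$.


Proof: Part 1: Between two clauses of some $F \in \mathcal{MU}_{\delta=1}$ there is at most one
conflict, and thus there are at most two full clauses in $F$, while by Corollary
we know $\nu\mathrm{fc}(\mathcal{MU}_{\delta=1}) \ge 2$. Part 2: By Corollary we have $\nu\mathrm{fc}(\mathcal{MU}_{\delta=2}) \ge 4$,
by Corollary we have $\nu\mathrm{fc}(\mathcal{MU}_{\delta=2}) \le 4$. Part 3: By Corollary we have
$\nu\mathrm{fc}(\mathcal{MU}_{\delta=3}) \ge 4$, by Corollary 12.13 we have $\nu\mathrm{fc}(\mathcal{MU}_{\delta=3}) \le 4$.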


13 A method for improving the mvd upper bound for MU


We now present a framework for generalising the argumentation of Theorem
together with the analysis of the underlying recursion from Section 7. The idea is
as follows:


1. We start with upper bounds $\mu\mathrm{nM}(k) \le a_k$ for $k = 1, \dots, p$, collected in a
"valid bounds-function" $f$.


2. For deficiency $p + 1$ and an envisaged min-var-degree $m$ we consider the set
$\mathrm{pp}_f(p + 1, m)$ of "possible" degree-pairs of variables (the degrees of the positive
and negative literals) in an envisaged clause-set $F \in \mathcal{SMU}_{\delta=p+1}$ with
$\mu\mathrm{vd}(F) = m$.


3. If $\mathrm{pp}_f(p + 1, m) = \emptyset$, then $m$ is "inconsistent", that is, impossible to realise
(as shown in Theorem 13.10), whence $\mu\mathrm{nM}(p + 1) < m$.


4. While in case of $\mathrm{pp}_f(p + 1, m) \ne \emptyset$ there might exist such an $F$ or not (the
formal reasoning underlying the definition of $\mathrm{pp}_f(p + 1, m)$ is not complete).
So here we generalise the approach of Section 7 for describing the function $\mathrm{nM}$ to a
general recursion scheme, obtaining a general method for improved upper bounds.
The applications in this report are as follows:


• In Theorem 13.15 we obtain an alternative description of $\mathrm{nM}(k)$.


• In Section we will first show that the smallest $k$, where we don't have
equality, is $k = 6$, namely $\mu\mathrm{vd}(\mathcal{MU}_{\delta=6}) = 8 = \mathrm{nM}(6) - 1$ (Theorem 14.3).
By the general recursion scheme it then follows from this improvement of the
upper bound that for all $k = 2^m - m + 1$ with $m \ge 3$ we have $\mu\mathrm{vd}(\mathcal{MU}_{\delta=k}) \le
\mathrm{nM}(k) - 1$. This improved upper bound is denoted by $\mathrm{nM}_1 : \mathbb{N} \to \mathbb{N}$


(Theorem fi4.9). 


13.1 Analysing splitting-situations 


"Valid bounds-functions" shall be monotonically increasing: we know that $\mathrm{nM}$
is (strictly) monotonically increasing, and we show that $\mu\mathrm{nM}$ is also monotonically
increasing (not strictly, as we will later see in Theorem fi4.4): 


Lemma 13.1 The map $\mu\mathrm{nM}$ is monotonically increasing ($\mu\mathrm{nM}(k) \le \mu\mathrm{nM}(k + 1)$
for $k \in \mathbb{N}$).


Proof: For $F \in \mathcal{MU}_{\delta=k}$, $n(F) \ne 0$, we can construct $F' \in \mathcal{MU}_{\delta=k+1}$ with
$\mu\mathrm{vd}(F) \le \mu\mathrm{vd}(F')$ as follows:


1. If $F$ is full, then obtain a non-full $F'' \in \mathcal{MU}_{\delta=k}$ with $\mu\mathrm{vd}(F) = \mu\mathrm{vd}(F'')$ by
a full singular unit-extension (Lemma 5.17), and replace $F$ by $F''$.


2. If $F$ is not full, then perform a strict full subsumption extension (Lemma 6.5),
obtaining the desired $F'$.


We define now "valid bounds-functions", which are sensible as upper bounds on
$\mu\mathrm{nM}$, and we also define how to obtain such a function from initial upper bounds
$\mu\mathrm{nM}(k) \le a_k$ for $k = 1, \dots, p$:


Definition 13.2 A valid bounds-function is a function $f : \mathbb{N} \to \mathbb{N} \cup \{+\infty\}$
fulfilling the following three conditions:


1. $f(1) = 2$.
2. $f$ is monotonically increasing (i.e., $\forall\, k, k' \in \mathbb{N} : k \le k' \Rightarrow f(k) \le f(k')$).


3. $f(k)$ is an upper bound for the minimal variable-degree of minimally unsatisfiable
clause-sets of deficiency $k$ (i.e., $\forall\, k \in \mathbb{N} : \mu\mathrm{nM}(k) \le f(k)$).


The set of all valid bounds-functions is denoted by VB C NN. And by VB* :={f € 
VB: f <nM} we denote the set of valid bounds-functions (pointwise) less-or-equal 
than the non-Mersenne function. 

For $a_1, \dots, a_p \in \mathbb{N}$, $p \in \mathbb{N}$, such that $a_1 = 2$, $a_i \le a_j$ for $i < j$, and $a_i \ge \mu\mathrm{nM}(i)$,
we define $[a_1, \dots, a_p]$ as that $f \in \mathcal{VB}$ with $f(k) = a_k$ for $k \in \{1, \dots, p\}$, while
$f(k) = \infty$ for $k > p$.


By Lemma 13.1, $\mu\mathrm{nM}$ is a valid bounds-function, namely the smallest possible one.
By Theorem 8.3 and Corollary also $\mathrm{nM}$ is a valid bounds-function. In Corollary
we will see that the continuation of $[a_1, \dots, a_p]$ by just $\infty$ is harmless,
since $\mathrm{nM}$ is automatically taken into account via the improvement of valid
bounds-functions through the use of potential degree-pairs defined below.
Lemma 13.3 $\mathcal{VB}$ as well as $\mathcal{VB}^*$, together with $\le$, is a complete lattice, where
infima resp. suprema are given by pointwise minimum resp. pointwise supremum.
The smallest element of both lattices is $\mu\mathrm{nM}$, while the largest is $[2]$ resp. $\mathrm{nM}$.


The following definition reflects the main method to analyse and improve a given
upper bound $f$ for $\mu\mathrm{nM}(k)$, namely it determines the numerical possibilities
compatible with $f$:


Definition 13.4 Consider $k, m \in \mathbb{N}$ with $k \ge 2$ and $m \ge 4$, together with a valid
bounds-function $f$. The set of potential (variable-)degree-pairs w.r.t. $f$ for
(deficiency) $k$ and (minimum variable-degree) $m$, denoted by $\mathrm{pp}_f(k, m)$, is the set
of pairs $(e_0, e_1) \in \mathbb{N}^2$ fulfilling the following conditions:


(i) $e_0, e_1 \ge 2$
(ii) $e_0, e_1 \le k$
(iii) $e_0 + e_1 = m$
(iv) $e_0 \le e_1$
(v) $\forall\, \varepsilon \in \{0, 1\} : f(k - e_\varepsilon + 1) + e_\varepsilon \ge m$.
We set $\mathrm{pp}(k, m) := \mathrm{pp}_{\mu\mathrm{nM}}(k, m)$.


The motivation for Definition is to assume $F \in \mathcal{SMU}_{\delta=k}$ with $\mu\mathrm{vd}(F) = m$
and $v \in \mathrm{var}_{\mu\mathrm{vd}}(F)$, and to determine the possible literal-degrees $e_0 = \mathrm{ld}_F(\overline{v})$,
$e_1 = \mathrm{ld}_F(v)$, "possible" in a formal sense. "e" stands for "eliminated clauses",
namely $e_\varepsilon$ is the number of clauses eliminated by $\langle v \to \varepsilon \rangle$. The "high" values of $m$
(for fixed $k$) are of real interest; compare Lemma 13.7. The basic properties of $\mathrm{pp}_f$
are as follows:


1. For every valid $f$ and $k \ge 2$ we have $\mathrm{pp}_f(k, 4) = \{(2, 2)\}$ and $\mathrm{pp}_f(k, m) = \emptyset$
for $m > 2k$.


2. Discussion of the five conditions (i) - (v) in Definition 13.4:


(i) Only non-singular variables are considered, since only in this way the
deficiency strictly decreases.

(ii) The deficiency of $F_\varepsilon := \langle v \to \varepsilon \rangle * F$ is $k_\varepsilon := k - e_\varepsilon + 1$ (assuming we
split on a variable with minimal degree), and we must have $k_\varepsilon \ge 1$.


(iii) $e_0, e_1$ are the literal-degrees of $\overline{v}, v$, which sum up to the variable-degree
$m$ of $v$.


(iv) W.l.o.g. we can restrict attention to such degree-pairs, since $F$ plays a
role only up to isomorphism, and thus one can flip the sign of $v$ in $F$.


(v) We have $F_\varepsilon \in \mathcal{MU}_{\delta=k-e_\varepsilon+1}$ (assuming $F$ is saturated). And for $w \in
\mathrm{var}(F_\varepsilon)$ we have $\mathrm{vd}_F(w) \le \mathrm{vd}_{F_\varepsilon}(w) + e_\varepsilon$. If for some $\varepsilon \in \{0, 1\}$ we would
have $\mu\mathrm{vd}(\mathcal{MU}_{\delta=k-e_\varepsilon+1}) + e_\varepsilon < m$, then for $w \in \mathrm{var}_{\mu\mathrm{vd}}(F_\varepsilon)$ we would
have $\mathrm{vd}_F(w) \le \mathrm{vd}_{F_\varepsilon}(w) + e_\varepsilon \le \mu\mathrm{vd}(\mathcal{MU}_{\delta=k-e_\varepsilon+1}) + e_\varepsilon < m$, but by
assumption on $w$ we have $\mathrm{vd}_F(w) \ge m$.


3. An important special case of $\mathrm{pp}_f(k, m)$ is $\mathrm{pp}_{\mathrm{nM}}(k, m)$; we have $\mathrm{pp}(k, m) \subseteq
\mathrm{pp}_{\mathrm{nM}}(k, m)$ (see Lemma for a generalisation). The main point in using
functions $f$ is that the precise values of $\mu\mathrm{nM}(k)$ might not be known.


4. To compute $\mathrm{pp}_f(k, m)$ according to the definition, only the values $f(k')$ for
$k' \in \{1, \dots, k - 1\}$ are needed.
Example 13.5 Consider $f := [\mathrm{nM}(1), \mathrm{nM}(2), \mathrm{nM}(3)] = [2, 4, 5]$. First we determine
$\mathrm{pp}_f(4, 7)$:


1. By Conditions (i) - (iv) only $\{(3, 4)\}$ remains.
2. Now $f(4 - 3 + 1) + 3 = f(2) + 3 = 7$, but $f(4 - 4 + 1) + 4 = f(1) + 4 = 6 < 7$.
3. Thus $\mathrm{pp}_f(4, 7) = \emptyset$.


We will see in Theorem 13.10 that from this we can conclude $\mu\mathrm{nM}(4) \le 6$ (there is
no "formal" possibility to reach the min-var-degree of 7 for deficiency 4). Now we
determine $\mathrm{pp}_f(4, 6)$:


1. By Conditions (i) - (iv), $\{(2, 4), (3, 3)\}$ are the possibilities.


2. Checking Condition (v) for $(2, 4)$: $f(4 - 2 + 1) + 2 = f(3) + 2 = 7$, $f(4 - 4 +
1) + 4 = f(1) + 4 = 6$.


3. Checking Condition (v) for $(3, 3)$: $f(4 - 3 + 1) + 3 = f(2) + 3 = 7$.
4. Thus $\mathrm{pp}_f(4, 6) = \{(2, 4), (3, 3)\}$.


The intuitive meaning of this is that a min-var-degree of 6 can not be excluded by
this type of formal reasoning, and 6 is the first new value according to this reasoning
for $[2, 4, 5]$.
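

The computation in Example 13.5 follows Definition 13.4 literally, so it can be mirrored by a few lines of Python. This is a sketch under the reading that the inequalities in conditions (i), (ii), (iv) and (v) are non-strict; `bounds` and `pp` are hypothetical helper names.

```python
import math


def bounds(*a):
    """[a_1, ..., a_p]: f(k) = a_k for k <= p, and +infinity beyond p."""
    return lambda k: a[k - 1] if k <= len(a) else math.inf


def pp(f, k, m):
    """Potential degree-pairs (e0, e1) w.r.t. f for deficiency k and degree m."""
    pairs = []
    for e0 in range(2, m // 2 + 1):   # (i) and (iv): 2 <= e0 <= e1
        e1 = m - e0                   # (iii): e0 + e1 = m
        if e1 > k:                    # (ii): e1 <= k (then also e0 <= k)
            continue
        if all(f(k - e + 1) + e >= m for e in (e0, e1)):  # (v)
            pairs.append((e0, e1))
    return pairs


f = bounds(2, 4, 5)
print(pp(f, 4, 7))  # no pair survives condition (v)
print(pp(f, 4, 6))  # both candidate pairs survive
```

Only values f(k') with k' < k are ever requested, in line with Remark 4 to Definition 13.4.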


We invite the reader to compute the following special case of what we show later
(in the proof of Theorem it might also be useful to consider Table 2):


Example 13.6 $\mathrm{pp}_{\mathrm{nM}}(13, 17) = \{(8, 9)\}$, while for any valid bounds-function $f$ with
$f(k) = \mathrm{nM}(k)$ for $k \in \{1, \dots, 5\}$ and $f(6) = \mathrm{nM}(6) - 1 = 8$ holds $\mathrm{pp}_f(13, 17) = \emptyset$.
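

Readers who prefer to check Example 13.6 by machine can use the following sketch. It evaluates Definition 13.4 once with nM and once with [2,4,5,6,8,8], which is the pointwise largest function with the stated values on 1,...,6, so by Lemma 13.8 emptiness for it carries over to every valid bounds-function with those values. Helper names are hypothetical.

```python
import math


def nM(k):
    """k-th natural number not of the form 2^n - 1."""
    count, n = 0, 0
    while count < k:
        n += 1
        if (n + 1) & n != 0:
            count += 1
    return n


def pp(f, k, m):
    """Potential degree-pairs of Definition 13.4 (non-strict inequalities assumed)."""
    pairs = []
    for e0 in range(2, m // 2 + 1):
        e1 = m - e0
        if e1 <= k and all(f(k - e + 1) + e >= m for e in (e0, e1)):
            pairs.append((e0, e1))
    return pairs


def f(k):
    """[2,4,5,6,8,8]: agrees with nM on 1..5, has f(6) = 8, and is +infinity beyond."""
    table = [2, 4, 5, 6, 8, 8]
    return table[k - 1] if k <= len(table) else math.inf


with_nM = pp(nM, 13, 17)
with_f = pp(f, 13, 17)
print(with_nM, with_f)
```

The pair (8, 9) is lost for f because condition (v) then reads f(6) + 8 = 16 < 17.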


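If we have a potential degree-pair for $m$, then also for $m' < m$:


Lemma 13.7 Consider $k, m, m' \in \mathbb{N}$ with $k \ge 2$ and $4 \le m' \le m$, and consider a
valid bounds-function $f$. If $\mathrm{pp}_f(k, m) \ne \emptyset$, then also $\mathrm{pp}_f(k, m') \ne \emptyset$.


Proof: Consider $(e_0, e_1) \in \mathrm{pp}_f(k, m)$. Consider any $2 \le e_0' \le e_0$ and $2 \le e_1' \le e_1$
with $e_0' \le e_1'$ and $e_0' + e_1' = m'$. Now $f(k - e_\varepsilon' + 1) + e_\varepsilon' \ge f(k - e_\varepsilon + 1) + e_\varepsilon' =
f(k - e_\varepsilon + 1) + e_\varepsilon - e_\varepsilon + e_\varepsilon' \ge m - e_\varepsilon + e_\varepsilon' = m - (m - e_{\overline{\varepsilon}}) + (m' - e_{\overline{\varepsilon}}') = e_{\overline{\varepsilon}} + m' - e_{\overline{\varepsilon}}' \ge m'$
for $\varepsilon \in \{0, 1\}$, and thus $(e_0', e_1') \in \mathrm{pp}_f(k, m') \ne \emptyset$.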


Using a smaller bounds-function can not yield more potential degree-pairs, as is
obvious from Definition 13.4:


Lemma 13.8 Consider $k, m \in \mathbb{N}$ with $k \ge 2$, $m \ge 4$, and valid bounds-functions
$f_1, f_2$ with $f_1 \le f_2$ (pointwise). Then $\mathrm{pp}_{f_1}(k, m) \subseteq \mathrm{pp}_{f_2}(k, m)$. Especially for any
valid bounds-function $f$ holds $\mathrm{pp}(k, m) \subseteq \mathrm{pp}_f(k, m)$.


Again directly by definition (using monotonicity of valid bounds-functions) we
get that increasing $k$ while keeping $m$ can not remove potential degree-pairs:


Lemma 13.9 Consider $k, m \in \mathbb{N}$ with $k \ge 2$, $m \ge 4$, and a valid bounds-function
$f$. Then $\mathrm{pp}_f(k, m) \subseteq \mathrm{pp}_f(k + 1, m)$.


The main use of potential degree-pairs is to provide upper bounds on $\mu\mathrm{nM}(k)$:


Theorem 13.10 Consider $k, m \in \mathbb{N}$ with $k \ge 2$, $m \ge 4$, and a valid bounds-function
$f$. If $\mathrm{pp}_f(k, m) = \emptyset$, then $\mu\mathrm{nM}(k) < m$.


Proof: Assume $\mu\mathrm{vd}(\mathcal{MU}_{\delta=k}) \ge m$. Then there is $F \in \mathcal{SMU}_{\delta=k}$ with $\mu\mathrm{vd}(F) \ge m$
(using Corollary B.5). Consider $v \in \mathrm{var}_{\mu\mathrm{vd}}(F)$; if $\mathrm{ld}_F(\overline{v}) \le \mathrm{ld}_F(v)$ holds, then let
$e := (\mathrm{ld}_F(\overline{v}), \mathrm{ld}_F(v))$, while otherwise flip the components of this pair. Now we have
$e \in \mathrm{pp}(k, \mu\mathrm{vd}(F))$ (using Remark 2 to Definition 13.4), and thus $\mathrm{pp}_f(k, m) \ne \emptyset$ by
Lemmas 13.7, contradicting the assumption.


13.2 Recursion on potential degree-pairs


Via potential degree-pairs and Theorem 13.10, we obtain a method for improving
valid bounds-functions:


Lemma 13.11 Consider $f \in \mathcal{VB}$. We obtain $f' \in \mathcal{VB}$ recursively as follows:


1. $f'(1) := 2$.


2. For $k > 1$ consider the largest $4 \le m \le 2k$ such that $\mathrm{pp}_{f'}(k, m) \ne \emptyset$, using
Remark 4 to Definition (that we only need $f'(k')$ for $k' < k$).


3. Now $f'(k) := \min(m, f(k))$.


Proof: $f'(k)$ is well-defined for $k \ge 2$ due to Remark 1 to Definition. That
$f'$ is valid follows by induction as follows. We have to show $f'(k) \le f'(k + 1)$ and
$\mu\mathrm{nM}(k) \le f'(k)$ for all $k \in \mathbb{N}$. For $k = 1$ both properties are true by definition.
And the induction step follows for monotonicity by Lemma 13.9 and for the
upper-bound-condition by Theorem 13.10.


The mapping $f \in \mathcal{VB} \mapsto f' \in \mathcal{VB}$ we call the "non-Mersenne operator":


Definition 13.12 For $f \in \mathcal{VB}$ let the $f' \in \mathcal{VB}$ according to Lemma be denoted
by $\mathrm{NM}(f) := f'$ (defined via "recursion on potential degree-pairs"); we call
$\mathrm{NM} : \mathcal{VB} \to \mathcal{VB}$ the "non-Mersenne operator".


The basic properties of the non-Mersenne operator are those of a kernel operator,
which are order-theoretic properties as follows:


Lemma 13.13 The map $\mathrm{NM} : \mathcal{VB} \to \mathcal{VB}$ is a kernel operator of the complete
lattice $\mathcal{VB}$, that is, for all $f, g \in \mathcal{VB}$ holds:


1. $\mathrm{NM}(f) \le f$ (intensive)
2. $\mathrm{NM}(\mathrm{NM}(f)) = \mathrm{NM}(f)$ (idempotent)
3. $f \le g \Rightarrow \mathrm{NM}(f) \le \mathrm{NM}(g)$ (monotonically increasing).


Proof: Intensivity follows by definition of $\mathrm{NM}$ (note that in Lemma we
have defined $f'(k)$ such that $f'(k) \le f(k)$ holds). Also idempotence follows directly
from the definition in Lemma 13.11, namely that $f'(k)$ for $k > 1$ already uses the
improved values $f'(k')$ for $k' < k$. Monotonicity follows by Lemma 13.9.


By Lemma we get that $\mathrm{NM}(f)$ for $f \in \mathcal{VB}$ is the supremum of the set of
$f' \le f$ with $\mathrm{NM}(f') = f'$. By Theorem 13.10 we get $\mathrm{NM}(\mu\mathrm{nM}) = \mu\mathrm{nM}$. In order
to show that the non-Mersenne operator at most reproduces $\mathrm{nM}$, that is, for all
$f \in \mathcal{VB}$ holds $\mathrm{NM}(f) \le \mathrm{nM}$, we need to provide potential degree-pairs for $\mathrm{nM}$:


Lemma 13.14 For $k \ge 2$ we have (recall Definition 7.14):
1. $(h(k), i_{\mathrm{nM}}(k)) \in \mathrm{pp}_{\mathrm{nM}}(k, \mathrm{nM}(k))$.
2. $\mathrm{pp}_{\mathrm{nM}}(k, \mathrm{nM}(k) + 1) = \emptyset$.


Proof: Let $m := \mathrm{nM}(k)$. For Part 1 let $e_0 := h(k)$, $e_1 := i_{\mathrm{nM}}(k)$; so we have to
show $(e_0, e_1) \in \mathrm{pp}_{\mathrm{nM}}(k, m)$. Consider the conditions (i) - (v) in Definition 13.4. We
have $e_0 \ge 2$, since $\mathrm{nM} \ge 2$ in general, and $e_1 \ge 2$ by Definition 7.8. As shown in
Corollary we have $e_0 \le e_1$, where $e_1 \le k$ by definition. Furthermore we have
$e_0 + e_1 = m$ by Lemma 7.14, Part 3. Altogether we have now shown conditions
(i) - (iv), and it remains to show that $\mathrm{nM}(k - e_\varepsilon + 1) + e_\varepsilon \ge m$ holds for both
$\varepsilon \in \{0, 1\}$; for $\varepsilon = 1$ we have equality, as already remarked, and it remains to show
$\mathrm{nM}(k - e_0 + 1) + e_0 \ge m$, which is equivalent to


$\mathrm{nM}(k - e_0 + 1) \ge i_{\mathrm{nM}}(k)$.


By Definition of $i_{\mathrm{nM}}(k)$ (as the smallest $i$) this is implied by $\mathrm{nM}(k - e_0 + 1) \ge
\mathrm{nM}(k - \mathrm{nM}(k - e_0 + 1) + 1)$. By the monotonicity of $\mathrm{nM}$ this is implied by $e_0 \le
\mathrm{nM}(k - e_0 + 1)$, i.e., $\mathrm{nM}(k - i_{\mathrm{nM}}(k) + 1) \le \mathrm{nM}(k - e_0 + 1)$. Again by monotonicity,
this is implied by $i_{\mathrm{nM}}(k) \ge e_0$, i.e., $e_1 \ge e_0$, which we have already shown.

For Part 2 we have to show $\mathrm{pp}_{\mathrm{nM}}(k, m + 1) = \emptyset$. Assume that we have $(e_0, e_1) \in
\mathrm{pp}_{\mathrm{nM}}(k, m + 1)$ according to Definition 13.4. Thus we have $\mathrm{nM}(k - e_1 + 1) + e_1 \ge
m + 1$, where $2 \le e_1 \le k$. Because of $e_0 + e_1 = m + 1$ and $e_0 \le e_1$ we get $e_1 \ge \frac{1}{2}(m + 1)$,
whence $\min(2 \cdot e_1, \mathrm{nM}(k - e_1 + 1) + e_1) \ge m + 1$, and thus $\mathrm{nM}(k) \ge m + 1$ by Definition

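We obtain an alternative recursion for $\mathrm{nM}(k)$ (recall Definition 7.1):


Theorem 13.15 $\mathrm{NM}([2]) = \mathrm{NM}(\mathrm{nM}) = \mathrm{nM}$.


Proof: By Definition and Lemma we get $\mathrm{NM}([2]) = \mathrm{nM}$. Since $\mathrm{NM}$ is
idempotent, we also get $\mathrm{NM}(\mathrm{nM}) = \mathrm{nM}$.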
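

The recursion of Lemma 13.11 is short enough to run directly. The sketch below (helper names are assumptions) implements it for a bounds-function given as a Python callable and compares NM([2]) with a direct enumeration of nM on an initial segment; agreement on finitely many values is of course only a plausibility check for Theorem 13.15.

```python
import math


def nM(k):
    """k-th natural number not of the form 2^n - 1."""
    count, n = 0, 0
    while count < k:
        n += 1
        if (n + 1) & n != 0:
            count += 1
    return n


def bounds(*a):
    """[a_1, ..., a_p], continued by +infinity."""
    return lambda k: a[k - 1] if k <= len(a) else math.inf


def pp(f, k, m):
    """Potential degree-pairs of Definition 13.4 (non-strict inequalities assumed)."""
    return [(e0, m - e0) for e0 in range(2, m // 2 + 1)
            if m - e0 <= k
            and all(f(k - e + 1) + e >= m for e in (e0, m - e0))]


def NM(f, upto):
    """Values NM(f)(1), ..., NM(f)(upto) via the recursion of Lemma 13.11."""
    fp = {1: 2}
    for k in range(2, upto + 1):
        # largest m in {4, ..., 2k} with a potential degree-pair w.r.t. f';
        # only the already computed values f'(k') with k' < k are looked up
        m = next(m for m in range(2 * k, 3, -1) if pp(fp.__getitem__, k, m))
        fp[k] = min(m, f(k))
    return [fp[k] for k in range(1, upto + 1)]


result = NM(bounds(2), 12)
print(result)
```

The search for m always terminates, since (2, 2) is a potential degree-pair for m = 4 whenever k >= 2.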


So the non-Mersenne operator yields $\mathrm{nM}$ in the worst-case:


Corollary 13.16 $\mathrm{NM} : \mathcal{VB} \to \mathcal{VB}^*$, that is, for every $f \in \mathcal{VB}$ holds $\mathrm{NM}(f) \le \mathrm{nM}$.


14 Strengthening of the mvd upper bound for MU 


In this final section many techniques introduced in this report come together, and
we give some initial sharpness results (considering small deficiencies), and some
non-sharpness results in the form of improved bounds (improving $\mathrm{nM}$ for infinitely
many deficiencies). In Subsection 14.1 we determine $\mu\mathrm{nM}(k)$ for $1 \le k \le 6$ as values
2, 4, 5, 6, 8, 8, where the main achievement is Theorem 14.3 showing $\mu\mathrm{nM}(6) = 8 =
\mathrm{nM}(6) - 1$. Applying the non-Mersenne operator, we obtain the improved upper
bound $\mu\mathrm{nM}(k) \le \mathrm{nM}_1(k)$ in Subsection 14.2, where $\mathrm{nM}_1$ is like $\mathrm{nM}$, but with a
duplication after the jump positions, that is, $\Delta\mathrm{nM}(k) = \Delta\mathrm{nM}_1(k) = 2$ is followed
by $\Delta\mathrm{nM}_1(k + 1) = \Delta\mathrm{nM}(k + 1) - 1 = 0$.


14.1 Deficiencies 1,...,6 


We now show that the first deficiency $k$, for which the bound $\mu\mathrm{nM}(k) \le \mathrm{nM}(k)$ is
not sharp, is $k = 6$. First we show sharpness for the first five values:


Theorem 14.1 For $k \in \{1, \dots, 5\}$ we have $\mu\mathrm{nM}(k) = \mu\mathrm{vd}(\mathcal{UHIT}_{\delta=k}) = \mathrm{nM}(k)$.


Proof: We have to give examples showing that the upper bound $\mathrm{nM}(k)$ is attained
for examples in $\mathcal{UHIT}_{\delta=k}$. Lemma 12.10 covers deficiencies $k = 1, 2, 5$, namely
1. $A_1 \in \mathcal{UHIT}_{\delta=1}$ has $\mathrm{fc}(A_1) = 2 = \mathrm{nM}(1)$ (recall Example B.9).
2. $A_2 \in \mathcal{UHIT}_{\delta=2}$ has $\mathrm{fc}(A_2) = 4 = \mathrm{nM}(2)$ (recall Example B.9).
3. $A_3 \in \mathcal{UHIT}_{\delta=5}$ has $\mathrm{fc}(A_3) = 8 = \mathrm{nM}(5)$.


Deficiency $k = 4$ is a jump position, and thus covered by Lemma, where the
example is as follows:


4. For $F_4 := \{\dots, \{-1, -2, -3\}\}$ we have $F_4 \in \mathcal{UHIT}_{\delta=4}$ with $\mathrm{fc}(F_4) = 6 = \mathrm{nM}(4)$.


The remaining case $k = 3$ we obtain via strict full subsumption resolution from $F_4$:


5. For $F_3 := \{\dots\}$
we have $F_3 \in \mathcal{UHIT}_{\delta=3}$ with $\mu\mathrm{vd}(F_3) = 5 = \mathrm{nM}(3)$.


We note that example $F_3$ in the proof of Theorem shows $\nu\mathrm{fc}(\mathcal{UHIT}_{\delta=3}) =
\nu\mathrm{fc}(\mathcal{MU}_{\delta=3}) = 4$ (together with Lemma 12.16, Part 3). In the sequel of this
subsection we consider $k = 6$. A computation shows that there is only one potential
degree-pair for the min-var-degree as given by $\mathrm{nM}(6) = 9$:


Lemma 14.2 $\mathrm{pp}_{\mathrm{nM}}(6, 9) = \{(4, 5)\}$.


Proof: Conditions (i) - (iv) of Definition yield $\mathrm{pp}_{\mathrm{nM}}(6, 9) \subseteq \{(3, 6), (4, 5)\}$.
Condition (v) excludes $(3, 6)$, since we have $\mu\mathrm{vd}(\mathcal{MU}_{\delta=6-6+1}) + 6 = 8 \not\ge 9$, while
$(4, 5)$ fulfils this condition due to $\mathrm{nM}(6 - 4 + 1) + 4 = 5 + 4 \ge 9$ and $\mathrm{nM}(6 - 5 + 1) + 5 =
4 + 5 \ge 9$.


However, the potential degree-pair of Lemma actually can not be realised,
and thus $\mu\mathrm{nM}(6) < \mathrm{nM}(6)$:


Theorem 14.3 $\nu\mathrm{fc}(\mathcal{UHIT}_{\delta=6}) = \nu\mathrm{fc}(\mathcal{MU}_{\delta=6}) = \mu\mathrm{vd}(\mathcal{UHIT}_{\delta=6}) = \mu\mathrm{nM}(6) = 8 =
\mathrm{nM}(6) - 1$.


Proof: $\nu\mathrm{fc}(\mathcal{UHIT}_{\delta=6}) \ge 8$ is confirmed by the variable-clause matrix


(variable-clause matrix not recoverable from the source)


(where unsatisfiability is given by $8 \cdot 2^{-4} + 2 \cdot 2^{-2} = 1$).
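

Since the matrix itself is not legible here, the following sketch uses a stand-in: a hypothetical clause-set over four variables with eight full clauses and two binary clauses, chosen only to match the counts in the sum above, and not claimed to be the example of the report. It checks the hitting property, the counting criterion quoted in the parenthesis, and the values of deficiency, fc and minimum variable-degree.

```python
from fractions import Fraction
from itertools import combinations, product


def clash(C, D):
    """Two clauses clash if some literal of C occurs complemented in D."""
    return any(-lit in D for lit in C)


def is_hitting(F):
    return all(clash(C, D) for C, D in combinations(F, 2))


def weight(F):
    """Sum of 2^(-|C|); for a hitting clause-set, value 1 means unsatisfiable."""
    return sum(Fraction(1, 2 ** len(C)) for C in F)


def variables(F):
    return {abs(lit) for C in F for lit in C}


def deficiency(F):
    return len(F) - len(variables(F))


def fc(F):
    V = variables(F)
    return sum(1 for C in F if {abs(lit) for lit in C} == V)


def min_var_degree(F):
    return min(sum(1 for C in F if v in C or -v in C) for v in variables(F))


# Hypothetical stand-in: full clauses in which variables 1 and 2 carry opposite
# signs, plus the two binary clauses {1, 2} and {-1, -2}.
full = {frozenset((s, -2 * s, 3 * s3, 4 * s4))
        for s, s3, s4 in product((1, -1), repeat=3)}
F = full | {frozenset((1, 2)), frozenset((-1, -2))}
print(is_hitting(F), weight(F), deficiency(F), fc(F), min_var_degree(F))
```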

Assume now that there exists $F \in \mathcal{MU}_{\delta=6}$ with $\mu\mathrm{vd}(F) = 9$. By Lemmas

; b.4 w.l.o.g. we can assume that F is saturated and non-singular. By Theorem 

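we know $n(F) \ge 4$. Consider $v \in \mathrm{var}(F)$ with $\mathrm{vd}_F(v) = 9$. W.l.o.g. we
assume $\mathrm{ld}_F(v) \ge \mathrm{ld}_F(\overline{v})$. By Lemma 14.2 we have $\mathrm{ld}_F(v) = 5$, $\mathrm{ld}_F(\overline{v}) = 4$, and
$\delta(F_0) = 6 - 4 + 1 = 3$, $\delta(F_1) = 6 - 5 + 1 = 2$.

Let the 5 occurrences of $v$ in $F$ be $C_1, \dots, C_5 \in F$, and let $C_i' := C_i \setminus \{v\}$. And
let the 4 occurrences of $\overline{v}$ in $F$ be $D_1, \dots, D_4 \in F$, and let $D_i' := D_i \setminus \{\overline{v}\}$. Using
$G := \{C \in F : v \notin \mathrm{var}(C)\} = F \setminus \{C_1, \dots, C_5, D_1, \dots, D_4\}$ we get


$F_0 = \{C_1', \dots, C_5'\} \cup G$
$F_1 = \{D_1', \dots, D_4'\} \cup G$


where $c(F_0) = 5 + c(G) = c(F) - 4$ and $c(F_1) = 4 + c(G) = c(F) - 5$.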
Consider first $F_0 \in \mathcal{MU}_{\delta=3}$. We have $\mu\mathrm{vd}(F_0) \ge 9 - 4 = 5$, and thus $\mu\mathrm{vd}(F_0) = 5$
(due to $\mu\mathrm{vd}(F_0) \le \mathrm{nM}(3) = 5$). Every variable $w \in \mathrm{var}(F_0)$ realising the
min-var-degree of $F_0$ has at least 9 occurrences in $F$, from which at most 4 are eliminated,
and thus actually such variables have $\mathrm{vd}_F(w) = 9$, and furthermore $w \in \mathrm{var}(D_i)$
for all $i \in \{1, \dots, 4\}$. By Lemma, Part (i), there exist two different variables
$u_1, u_2 \in \mathrm{var}(F_0)$ with $\mathrm{vd}_{F_0}(u_i) = 5$ for $i \in \{1, 2\}$, and so we have $|D_i| \ge 3$ for all
$i \in \{1, \dots, 4\}$.

Now consider $F_1 \in \mathcal{MU}_{\delta=2}$. We have $\mu\mathrm{vd}(F_1) \ge 9 - 5 = 4$, thus $\mu\mathrm{vd}(F_1) = 4$
(due to $\mu\mathrm{vd}(F_1) \le \mathrm{nM}(2) = 4$), and thus by Lemma $F_1$ is non-singular iff $F_1$
does not contain unit-clauses. If $F_1$ would contain a unit-clause, then there would be
a binary clause $\{\overline{v}, x\} \in F$, contradicting that all $D_i$ contain at least three literals.
So $F_1$ is non-singular, and thus $F_1$ is isomorphic to some $\mathcal{F}_m$ for some $m \ge 2$. So
$F_1$ is 4-variable-regular, where all the variables of $F_1$ have at least 9 occurrences in
$F$, and thus we have $\mathrm{var}(F_1) \subseteq \mathrm{var}(C_i')$ for all $i \in \{1, \dots, 5\}$, which implies that
actually $\mathrm{var}(C_i') = \mathrm{var}(F_1) = \mathrm{var}(F_0)$ holds.

Coming back to the structure of $F_0$, we now know that $F_0$ has five full clauses
$C_1', \dots, C_5'$, which contradicts Lemma 12.16, Part 3.


14.2 Sharpening the bound


Based on recursion on potential degree-pairs, we can improve the upper bound
$\mathrm{nM}(k)$ for $\mu\mathrm{vd}(\mathcal{MU}_{\delta=k})$ for $k \ge 6$ (generalising Example 13.6):


Definition 14.4 Let $\mathrm{nM}_1 : \mathbb{N} \to \mathbb{N}$ be defined as $\mathrm{nM}_1 := \mathrm{NM}([2, 4, 5, 6, 8, 8])$ (recall
Definition 13.12).


By Lemma together with Theorem we get:


Theorem 14.5 For all $k \in \mathbb{N}$ we have $\mu\mathrm{nM}(k) \le \mathrm{nM}_1(k)$.


It remains to determine $\mathrm{nM}_1$ numerically:


Theorem 14.6 In Table 3 we find the values of $\mathrm{nM}_1(k)$ for $k \le 30$. We have
$\mathrm{nM}_1(k) = \mathrm{nM}(k)$ for $k \notin \{2^m - m + 1 : m \in \mathbb{N}, m \ge 3\}$, while for $k = 2^m - m + 1$
we have $\mathrm{nM}_1(k) = \mathrm{nM}(k) - 1 = 2^m$.

k       |  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
nM_1(k) |  2  4  5  6  8  8 10 11 12 13 14 16 16 18 19

k       | 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
nM_1(k) | 20 21 22 23 24 25 26 27 28 29 30 32 32 34 35

Table 3: Values of $\mathrm{nM}_1(k)$ for $k \in \{1, \dots, 30\}$, in bold the jump-values (i.e., $k \in J$),
and underlined the changed values compared to $\mathrm{nM}(k)$; we see that directly after
the jump we have stagnation, followed by a second jump.


Proof: We show the formula for $\mathrm{nM}_1(k)$ via induction on $k$. Due to $2^3 - 3 + 1 = 6$
it holds for $k \le 6$. So assume $k \ge 7$. We show by induction on $k \ge 7$ the following
two properties (which imply the assertions of the theorem):


1. For $k = 2^m - m + 1$, $m \ge 4$, we have

(a) $\mathrm{pp}_{\mathrm{nM}_1}(k, \mathrm{nM}(k)) = \emptyset$ and
(b) $\mathrm{pp}_{\mathrm{nM}_1}(k, \mathrm{nM}(k) - 1) \ne \emptyset$.

2. Otherwise $\mathrm{pp}_{\mathrm{nM}_1}(k, \mathrm{nM}(k)) \ne \emptyset$.

By Lemma 13.13 we know $\mathrm{nM}_1 \le \mathrm{nM}$.


Part 1. We consider $k = 2^m - m + 1$, $m \ge 4$. We have $\mathrm{nM}(k) = 2^m + 1$.

Part (a). To show $\mathrm{pp}_{\mathrm{nM}_1}(2^m - m + 1, 2^m + 1) = \emptyset$, we assume $(e_0, e_1) \in
\mathrm{pp}_{\mathrm{nM}_1}(2^m - m + 1, 2^m + 1)$. Thus we know $e_0, e_1 \ge 2$, $e_0, e_1 \le 2^m - m + 1$,
$e_0 + e_1 = 2^m + 1$, $e_0 \le e_1$, whence $e_0 \le 2^{m-1}$, and


$\mathrm{nM}_1(2^m - m + 1 - e_\varepsilon + 1) + e_\varepsilon \ge 2^m + 1$     (1)


for both $\varepsilon \in \{0, 1\}$.

Case (a.1). Assume $e_0 \le 2^{m-1} - 1$, and thus $e_1 \ge 2^{m-1} + 2$.
From (1) we get $\mathrm{nM}(2^m - m + 1 - e_1 + 1) + e_1 \ge 2^m + 1$, where (using Corollary):


$\mathrm{nM}(2^m - m + 1 - e_1 + 1) + e_1 \ge 2^m + 1 \Rightarrow$
$\mathrm{nM}(2^m - m + 1 - (2^{m-1} + 2) + 1) + 2^{m-1} + 2 \ge 2^m + 1 \Leftrightarrow$
$\mathrm{nM}(2^{m-1} - m) \ge 2^{m-1} - 1$,


where by Corollary we have $\mathrm{nM}(2^{m-1} - m) = \mathrm{nM}(2^{m-1} - (m - 1) - 1) =
2^{m-1} - 2$, and we obtained a contradiction, finishing Case (a.1).

Case (a.2). It remains $e_0 = 2^{m-1}$. From (1) we get $\mathrm{nM}_1(2^m - m + 1 - e_0 + 1) +
e_0 \ge 2^m + 1$, where $2^m - m + 1 - e_0 + 1 = 2^m - m + 1 - 2^{m-1} + 1 = 2^{m-1} - (m - 1) + 1$, and
thus by induction hypothesis we get $\mathrm{nM}_1(2^m - m + 1 - e_0 + 1) + e_0 = 2^{m-1} + 2^{m-1} =
2^m$, a contradiction. This concludes Part (a).

Part (b). We show $(2^{m-1}, 2^{m-1}) \in \mathrm{pp}_{\mathrm{nM}_1}(k, \mathrm{nM}(k) - 1)$ (footnote 15). For this it remains
to show $\mathrm{nM}_1(2^m - m + 1 - 2^{m-1} + 1) + 2^{m-1} \ge 2^m$, and indeed $\mathrm{nM}_1(2^m - m +
1 - 2^{m-1} + 1) = \mathrm{nM}_1(2^{m-1} - (m - 1) + 1) = 2^{m-1}$ by induction hypothesis. This
concludes Part 1.


Part 2. $k \ne 2^m - m + 1$ for any $m \ge 4$. We have to show $\mathrm{pp}_{\mathrm{nM}_1}(k, \mathrm{nM}(k)) \ne \emptyset$.

Part (a). $k = 2^m - m + 2$; thus $\mathrm{nM}(k) = 2^m + 2$.

We have $(2^{m-1}, 2^{m-1} + 2) \in \mathrm{pp}_{\mathrm{nM}_1}(k, 2^m + 2)$ (footnote 16), due to
$\mathrm{nM}_1(2^m - m + 2 - 2^{m-1} + 1) + 2^{m-1} = \mathrm{nM}_1(2^{m-1} - (m - 1) + 2) + 2^{m-1} = 2^{m-1} + 2 + 2^{m-1}$ and
$\mathrm{nM}_1(2^m - m + 2 - (2^{m-1} + 2) + 1) + 2^{m-1} + 2 = \mathrm{nM}_1(2^{m-1} - (m - 1)) + 2^{m-1} + 2 =
2^{m-1} + 2^{m-1} + 2$.

Part (b). $k = 2^m - m + 3$; thus $\mathrm{nM}(k) = 2^m + 3$.

We have $(2^{m-1}, 2^{m-1} + 3) \in \mathrm{pp}_{\mathrm{nM}_1}(k, 2^m + 3)$ (footnote 17), due to
$\mathrm{nM}_1(2^m - m + 3 - 2^{m-1} + 1) + 2^{m-1} = \mathrm{nM}_1(2^{m-1} - (m - 1) + 3) + 2^{m-1} = 2^{m-1} + 3 + 2^{m-1}$ and
$\mathrm{nM}_1(2^m - m + 3 - (2^{m-1} + 3) + 1) + 2^{m-1} + 3 = \mathrm{nM}_1(2^{m-1} - (m - 1)) + 2^{m-1} + 3 =
2^{m-1} + 2^{m-1} + 3$.

For all remaining cases


$2^m - m + 4 \le k \le 2^{m+1} - (m + 1)$


we show $(e_0, e_1) := (h(k), i_{\mathrm{nM}}(k)) \in \mathrm{pp}_{\mathrm{nM}_1}(k, \mathrm{nM}(k))$, as in Lemma 13.14, Part 1.
Recall Corollary for the computation of $i_{\mathrm{nM}}(k)$.

For the critical condition "$\mathrm{nM}_1(k - e_1 + 1) + e_1 \ge \mathrm{nM}(k)$" we recall $k - i_{\mathrm{nM}}(k) + 1 =
i'(k)$ (recall Definition 7.13), where $i'(k)$ is monotonically increasing. We want to
show that $i'(k)$ is never equal to some argument where $\mathrm{nM}_1 \ne \mathrm{nM}$, and thus we
need to check the upper and the lower bounds:

For $k = 2^m - m + 4$ holds $i_{\mathrm{nM}}(k) = 2^{m-1} + 2$, thus $i'(k) = 2^m - m + 4 - (2^{m-1} +
2) + 1 = 2^{m-1} - m + 3 > 2^{m-1} - (m - 1) + 1$.

For $k = 2^{m+1} - (m + 1)$ holds $i_{\mathrm{nM}}(k) = 2^m$, thus $i'(k) = 2^{m+1} - (m + 1) - 2^m + 1 =
2^m - m < 2^m - m + 1$.

So all critical conditions for $e_1$ are fulfilled, and it remains to check $e_0$.

Part (c). $2^m - m + 4 \le k \le 2^{m+1} - (m + 1) - 2$.

Here for the critical conditions "$\mathrm{nM}_1(k - e_0 + 1) + e_0 \ge \mathrm{nM}(k)$" the values
$k - e_0 + 1$ are never equal to some argument where $\mathrm{nM}_1 \ne \mathrm{nM}$, since $\Delta(k - h(k) + 1) =
1 - \Delta h(k) \ge 0$ for $k < 2^{m+1} - (m + 1) - 2$ by Theorem 7.15, and thus $k - h(k) + 1$
is monotonically increasing for the $k$-range we consider.

Checking for the lower bound $k = 2^m - m + 4$: $i_{\mathrm{nM}}(k) = 2^{m-1} + 2$, thus
$h(k) = \mathrm{nM}(k) - i_{\mathrm{nM}}(k) = 2^m + 4 - 2^{m-1} - 2 = 2^{m-1} + 2$, which is the same as
$i_{\mathrm{nM}}(k)$, and doesn't need to be checked again.

Checking for the upper bound $k = 2^{m+1} - (m + 1) - 2$: $i_{\mathrm{nM}}(k) = 2^m - 1$, and
thus $h(k) = \mathrm{nM}(k) - i_{\mathrm{nM}}(k) = 2^{m+1} - 3 - 2^m + 1 = 2^m - 2$, and so we check
$2^{m+1} - (m + 1) - 2 - (2^m - 2) + 1 = 2^m - m < 2^m - m + 1$.

Part (d). $k = 2^{m+1} - (m + 1) - 1$; thus $\mathrm{nM}(k) = 2^{m+1} - 2$.

Now $i_{\mathrm{nM}}(k) = 2^m$, and so $h(k) = \mathrm{nM}(k) - i_{\mathrm{nM}}(k) = 2^m - 2$, and so the critical
argument is $2^{m+1} - (m + 1) - 1 - (2^m - 2) + 1 = 2^m - m + 1$. It remains to check
$\mathrm{nM}_1(2^m - m + 1) + 2^m - 2 \ge 2^{m+1} - 2$, and indeed $\mathrm{nM}_1(2^m - m + 1) + 2^m - 2 =
2^m + 2^m - 2$.

Part (e). $k = 2^{m+1} - (m + 1)$; thus $\mathrm{nM}(k) = 2^{m+1}$.

Now $h(k) = 2^m = i_{\mathrm{nM}}(k)$, and no check is needed.


15) We have $(2^{m-1}, 2^{m-1}) = (h(k), i_{\mathrm{nM}}(k) - 1)$, but we don't need this here.
16) We have $(2^{m-1}, 2^{m-1} + 2) = (h(k) - 1, i_{\mathrm{nM}}(k) + 1)$.
17) We have $(2^{m-1}, 2^{m-1} + 3) = (h(k) - 1, i_{\mathrm{nM}}(k) + 1)$.


It is instructive to note the new A-values explicitly: 
Corollary 14.7 For $k \in \mathbb{N}$ holds $\Delta\mathrm{nM}_1(k) \in \{0, 1, 2\}$, with
1. AnMi(k) =0 = k=2™-—mM for somemeEN, m>3. 


2. AnMi(k) =2 =} k=2™—m+1 for somemeEN, m>3. 
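

The values of nM_1 can be regenerated from Definition 14.4 by running the recursion of Lemma 13.11 on [2,4,5,6,8,8]. The sketch below does this and lists the positions where the forward difference nM_1(k+1) - nM_1(k) vanishes; reading the Delta-values as this forward difference is an assumption, as are the helper names.

```python
import math


def bounds(*a):
    """[a_1, ..., a_p], continued by +infinity."""
    return lambda k: a[k - 1] if k <= len(a) else math.inf


def pp(f, k, m):
    """Potential degree-pairs of Definition 13.4 (non-strict inequalities assumed)."""
    return [(e0, m - e0) for e0 in range(2, m // 2 + 1)
            if m - e0 <= k
            and all(f(k - e + 1) + e >= m for e in (e0, m - e0))]


def NM(f, upto):
    """Values NM(f)(1), ..., NM(f)(upto) via the recursion of Lemma 13.11."""
    fp = {1: 2}
    for k in range(2, upto + 1):
        m = next(m for m in range(2 * k, 3, -1) if pp(fp.__getitem__, k, m))
        fp[k] = min(m, f(k))
    return [fp[k] for k in range(1, upto + 1)]


nM1 = NM(bounds(2, 4, 5, 6, 8, 8), 30)

# nM1[k - 1] is nM_1(k); collect the k with nM_1(k + 1) - nM_1(k) == 0
stagnation = [k for k in range(1, 15) if nM1[k] - nM1[k - 1] == 0]
print(nM1)
print(stagnation)
```

The printed list can be compared entry by entry with Table 3.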


15 Conclusion and open problems 


The main subject of this report can be seen in the study of $\mu\mathrm{vd}(\mathcal{C}_{\delta=k})$ for classes
$\mathcal{UHIT} \subseteq \mathcal{C} \subseteq \mathcal{MLEAN}$ and $k \in \mathbb{N}$, that is, the study of the maximal minimum
variable-degree of classes of matching-lean clause-sets containing all minimally
unsatisfiable clause-sets, parameterised by the deficiency. If $\mathcal{C} \subseteq \mathcal{LEAN}$, then this
quantity is bounded, and indeed we have shown $\mu\mathrm{vd}(\mathcal{LEAN}_{\delta=k}) = \mathrm{nM}(k)$ (more
generally this holds for every subclass of $\mathcal{LEAN}$ containing $\mathcal{VMU}$). While for
$\mathcal{C} = \mathcal{MLEAN}$ this quantity is unbounded. For $\mathcal{MU}$ we have shown the improved
bound $\mu\mathrm{vd}(\mathcal{MU}_{\delta=k}) \le \mathrm{nM}_1(k)$, where indeed also this bound is not sharp (as will
be shown in [68]; see Subsection 15.2); the question about the determination of
$\mu\mathrm{nM}(k) = \mu\mathrm{vd}(\mathcal{MU}_{\delta=k})$ is a major open research question.

For lean clause-sets we have shown the strengthened upper bound $\mu\mathrm{vd}(F) \le
\mathrm{nM}(\sigma(F))$, and indeed for every clause-set $F$ we can satisfiability-equivalently
remove some clauses in polynomial time such that this upper bound holds.


15.1 Conjectures and questions 


We made the following conjectures: 
1. Conjecture 10.3: If a clause-set violates the upper bound on the min-var-degree
for lean clause-sets, then it must have a non-trivial autarky. As we have
seen, we can determine the set of variables involved, but the determination
of the autarky itself is open; the conjecture states that there is a poly-time
algorithm for computing such an autarky. See Subsection for more
information on this topic.


2. Conjecture 12.2: the maximum min-var-degree for unsatisfiable hitting clause-sets
is the same as for the larger class of minimally unsatisfiable clause-sets.
In Conjecture we generalise this to non-boolean clause-sets.


3. Conjecture 12.3: $\mathrm{nM}$ is not far away from $\mu\mathrm{nM}$, more precisely, $\mathrm{nM} - 1 \le
\mu\mathrm{nM} \le \mathrm{nM}$. The stronger Conjecture 12.5: the same holds even for the
maximal number of full clauses, that is, $\mathrm{nM}(k) - 1 \le \nu\mathrm{fc}(\mathcal{MU}_{\delta=k}) \le \mathrm{nM}(k)$
for all $k \in \mathbb{N}$. In Lemma we will state a weaker, but proven (in future
work) lower bound.


Five more conjectures will be presented in this final section. We asked also the 
following questions: 


1. Question is about some complexity problems around the elimination of 
literal occurrences in minimally unsatisfiable clause-sets. 


2. Question D.dlis about the complexity of SAT decision for SED. At first sight it 
might seem easy to translate every F' € CLS into some sat-equivalent element 
of SED, and in fact to manipulate deficiency and surplus alone is rather easy, 
but we do not know how to handle them together. 


3. Question concerns the existence of unsatisfiable hitting clause-sets of 
arbitrary surplus equal deficiency and a min-var-degree as low as possible. 
An underlying question is to understand better the quantity “surplus”. 


4. Question 11.2 is about strengthening the construction of Lemma 11.1, for 
finding matching-lean clause-sets of high minimum literal-degree (perhaps 
completely different constructions are needed). 


In the remainder we outline main research areas related to the topics of the 
report. 


15.2 Improved upper bounds for μnM 


We know μnM ≤ nM_1, and we know μnM(k) precisely for k ∈ {1, ..., 6} and for 
k ∈ J, J + 1. Also of high relevance here is to determine μvd(UHIT_{δ=k}), which by 
Conjecture is the same as μnM(k). Another major conjecture is Conjecture 
2.9, which says that μnM deviates at most by 1 from nM. Beyond this article, we 
know the following improvements of the upper bound nM_1: 


• Generalising the ideas of Theorem 14.3, which is based on the improved upper 
bound for deficiency 2^3 − 3 + 1 = 6, we can show also for deficiency k = 
2^4 − 4 + 2 = 14 that we have μvd(MU_{δ=k}) = nM(k) − 1. Via the 
non-Mersenne operator, this yields the improved upper bound nM_2. 


• Altogether we obtain a sequence of improved upper bounds nM_{m−2} for m ∈ N, 
m ≥ 3, improving the upper bound at deficiency k = 2^m − 2 for nM_{m−3} and 
applying the non-Mersenne operator. 


• The infimum of nM_1, nM_2, ... is nM_ω. This will be developed in [63]. 


• However, this is not the end of it: also for deficiency k = 15 we have 
μvd(MU_{δ=k}) = nM(k) − 1, obtaining nM_{ω+1}. This new improvement depends 
on new ideas; will there be an infinite chain of ever-increasing complexity 
of such improvements? 
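
The arithmetic behind the deficiencies named in this list can be checked with a few lines of Python. This is only a sketch: nM is computed by direct enumeration, the helper names are our own, and the corrected values nM(k) − 1 for k = 6, 14, 15 are simply the values stated in the text, not something the code derives.

```python
from itertools import count, islice


def nM(k: int) -> int:
    """k-th natural number not of the form 2**m - 1 (k >= 1)."""
    non_mersenne = (n for n in count(1) if ((n + 1) & n) != 0)
    return next(islice(non_mersenne, k - 1, None))


# The two deficiencies written out in the first item of the list.
assert 2**3 - 3 + 1 == 6 and 2**4 - 4 + 2 == 14

# Deficiencies k = 2**m - 2 at which the bounds nM_{m-2} improve on nM_{m-3}.
improvement_points = [2**m - 2 for m in range(3, 7)]
print(improvement_points)  # [6, 14, 30, 62]

# Values nM(k) - 1 at the deficiencies for which the text reports a correction.
corrected = {k: nM(k) - 1 for k in (6, 14, 15)}
print(corrected)  # {6: 8, 14: 17, 15: 18}
```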


We believe that a closed “nice” formula for μnM(k) is impossible, but that 
computation is nevertheless possible: 


Conjecture 15.1 The function μnM : N → N is “complex”, and for no finite tuple 
a holds NM(a) = μnM; however μnM is computable in doubly-exponential time. 


See Lemma for some conditions which imply the computability-part of 
Conjecture 15.1. 


15.3 Determining νfc(MU_{δ=k}) 


While Subsection was about improving the upper bound, here now we turn 
to the lower bound. In Subsection we provided only the minimum needed in 
this report for the measure fc(F) of full clauses. In the forthcoming [66] we show 
the following lower bound, using S_2 : N → N, the function for the “Smarandache 
Primitive Numbers” introduced in [4, Unsolved Problem 47], which for k ∈ N is 
defined as the minimal natural number s ∈ N such that 2^k divides s!. 


Lemma 15.2 ([66]) For all k ∈ N holds νfc(UHIT_{δ=k}) ≥ S_2(k). 


Lemma yields the interesting inequality S_2 ≤ μnM ≤ nM. This is relevant both as 
the upper bound nM on S_2 and as the lower bound S_2 on μnM. From pa we 
get that k + 1 ≤ S_2(k) and thus by Corollary we get 


Lemma 15.3 ([66]) k + 1 ≤ S_2(k) ≤ nM(k) ≤ k + 1 + fld(k) for k ∈ N. 
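
The chain of inequalities in Lemma 15.3 can be checked numerically for small k. The sketch below is only an illustration under our own naming: S_2(k) is computed from its definition via Legendre's formula for the exponent of 2 in s!, nM(k) by direct enumeration, and fld(k) is taken to be the floor of the binary logarithm.

```python
from itertools import count, islice


def nM(k: int) -> int:
    """k-th natural number not of the form 2**m - 1 (k >= 1)."""
    non_mersenne = (n for n in count(1) if ((n + 1) & n) != 0)
    return next(islice(non_mersenne, k - 1, None))


def exponent_of_two_in_factorial(s: int) -> int:
    """Largest e with 2**e dividing s!, by Legendre's formula."""
    total, power = 0, 2
    while power <= s:
        total += s // power
        power *= 2
    return total


def S2(k: int) -> int:
    """Minimal s >= 1 such that 2**k divides s!."""
    s = 1
    while exponent_of_two_in_factorial(s) < k:
        s += 1
    return s


def fld(k: int) -> int:
    """Floor of the binary logarithm of k >= 1."""
    return k.bit_length() - 1


# Check k + 1 <= S2(k) <= nM(k) <= k + 1 + fld(k) on an initial segment.
chain_holds = all(k + 1 <= S2(k) <= nM(k) <= k + 1 + fld(k)
                  for k in range(1, 301))
print(chain_holds)  # True
print([S2(k) for k in range(1, 9)])  # [2, 4, 4, 6, 8, 8, 8, 10]
```

Such a check is of course no proof; it only confirms the statement on the tested range.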


Recall that in Lemma we obtained a much sharper lower bound for μnM(k) 
from Conjecture. For sequences a, b : N → R let asymptotic equality be denoted 
by a ∼ b :⇔ lim_{n→∞} a_n / b_n = 1. 


Corollary 15.4 ([66]) The six sequences S_2(k), νfc(UHIT_{δ=k}), νfc(MU_{δ=k}), 
μvd(MU_{δ=k}), μnM(k), nM(k) are asymptotically equal to (k)_{k∈N} (these are known 
facts for S_2(k) and nM(k)). 


In Figure 1 we show the six quantities from Corollary and the relations 
between them (we do not mention the dependencies on the deficiency k ∈ N there). 
An arrow means a (proven) ≤-relation. If the arrow is labelled with “−1”, then we 
conjecture the difference is at most −1 (while in all three cases we know cases where 
the difference is equal to −1), the label “=” means that we conjecture equality, and 
the label “∞” means that we conjecture that the difference is unbounded. 

For a more precise asymptotic determination of these six quantities from 
Corollary, calling them a_k, we need to consider the six sequences a_k − k. Currently 
we only know nM(k) − k ∼ ld(k). 


15.4 Generalisation to non-boolean clause-sets 


It is interesting to generalise Theorem for generalised clause-sets; see 64, | 
for a systematic study, while the most general notion of generalised clause-sets, 
“signed clause-sets” are discussed in i. Generalised clause-sets F’ have literals 
(v, ε), meaning “v ≠ ε”, for variables v with non-empty finite domains D_v and 
values ε ∈ D_v. The deficiency is generalised by giving every variable a weight 
|D_v| − 1 ∈ N_0 (which is 1 in the boolean case), i.e., 
δ(F) = c(F) − Σ_{v∈var(F)} (|D_v| − 1) = c(F) + n(F) − Σ_{v∈var(F)} |D_v|; 
see [b7, Subsection 7.2]. A partial assignment 
is a map φ with some finite set of variables as domain dom(φ) =: var(φ), which 
maps v ∈ var(φ) to φ(v) ∈ D_v. A partial assignment φ satisfies a clause-set F 
iff for every C ∈ F there is (v, ε) ∈ C with v ∈ var(φ) and φ(v) ≠ ε. Minimally 
unsatisfiable (generalised) clause-sets are defined as usual (they are unsatisfiable, 
while every strict subset is satisfiable). In [64, Corollary 9.9] it is shown that also 
all minimally unsatisfiable generalised clause-sets F fulfil δ(F) ≥ 1 (based, like in 
the boolean case, on matching autarkies). 

Figure 1: The four main combinatorial quantities, and the two numerical functions 

The degree vd_F(v) of a variable v in a clause-set F is the sum of the degrees of 
the literals (v, ε) for ε ∈ D_v, and thus vd_F(v) = |{C ∈ F : C ∩ ({v} × D_v) ≠ ∅}|. For a 
given deficiency k ∈ N, the basic question is to determine the supremum of μvd(F) 
over all minimally unsatisfiable F with δ(F) = k. The base case of deficiency k = 1 
is handled in Ba, Lemma 5.4], showing that for generalised minimally unsatisfiable 
clause-sets of deficiency 1 we have μvd(F) ≤ max_{v∈var(F)} |D_v|; actually all structural 
knowledge from B (17, | has been completely generalised in Ba, Subsection 5.2]. 
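
The variable-degree and the minimum variable-degree of a generalised clause-set can be computed directly from this definition. The following sketch assumes the same illustrative representation as before (clauses as frozensets of pairs (v, e), read as “v ≠ e”); the names are hypothetical. The example is a clause-set of deficiency 1, so the bound by the maximal domain size applies to it.

```python
def variables(F):
    """Set of variables occurring in the clause-set F."""
    return {v for C in F for (v, _) in C}


def var_degree(F, v):
    """Number of clauses of F containing some literal (v, e)."""
    return sum(1 for C in F if any(w == v for (w, _) in C))


def min_var_degree(F):
    """Minimum of the variable-degrees over all variables of F."""
    return min(var_degree(F, v) for v in variables(F))


dom = {"a": {0, 1, 2}, "b": {0, 1}}
F = [frozenset({("a", 0)}), frozenset({("a", 1)}),
     frozenset({("a", 2), ("b", 0)}), frozenset({("a", 2), ("b", 1)})]

print(var_degree(F, "a"), var_degree(F, "b"))  # 4 2
print(min_var_degree(F) <= max(len(dom[v]) for v in variables(F)))  # True
```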

But k ≥ 2 requires more work, since here the basic method of saturation is not 
available for generalised clause-sets, as discussed in Subsection 5.1 in ba: saturated 
generalised clause-sets (i.e., unsatisfiable clause-sets, where no literal occurrence 
can be added without rendering the clause-set satisfiable) with deficiency at least 
2 after splitting do not necessarily generate minimally unsatisfiable (generalised) 
clause-sets. Thus the proofs for the boolean case seem not to be generalisable for 
arbitrary minimally unsatisfiable (generalised) clause-sets. 

In order to repair this, the “substitution stability parameter regarding 
irredundancy” sir(F) ∈ Z_{≥−1} ∪ {+∞} is introduced in Subsection 5.3]), defined as 
the supremum of k ∈ Z_{≥−1} such that for every partial assignment φ with n(φ) := 
|var(φ)| ≤ k the clause-set φ * F, obtained as usual by application of φ to F, 
is minimally unsatisfiable. So sir(F) ≥ 0 iff F is minimally unsatisfiable, and as 
shown in Corollary 4.8], sir(F) = +∞ iff F is a hitting clause-set (i.e., for all 
C, D ∈ F, C ≠ D, there are x ∈ C, y ∈ D with x = (v, ε) and y = (v, ε′) for some 
variable v and ε, ε′ ∈ D_v with ε ≠ ε′). And sir(F) ≥ 1 iff splitting on any variable 
always yields a minimally unsatisfiable clause-set. So for a boolean clause-set F 
we have sir(F) ≥ 1 iff F is saturated, but for generalised clause-sets we only have that 
sir(F) ≥ 1 implies saturatedness (pS, Corollary 5.3]). 

In Corollary 5.10 in one finds a generalisation of the basic bound μvd(F) ≤ 
2δ(F) for the boolean case. Namely μvd(F) ≤ max_{v∈var(F)} |D_v| · δ(F) is shown for 
F with sir(F) ≥ 1. Since for (generalised) saturated F with δ(F) = 1 we have 
sir(F) = ∞ (ba, Corollary 5.6]), this covers the above mentioned result μvd(F) ≤ 
max_{v∈var(F)} |D_v| for (arbitrary) minimally unsatisfiable F with δ(F) = 1 (note that 
here saturation works as in the boolean case). 
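
The hitting condition recalled above is easy to test, and for hitting clause-sets unsatisfiability reduces to a counting argument: the sets of total assignments falsifying the clauses are pairwise disjoint, so the clause-set is unsatisfiable iff these sets cover all total assignments. The sketch below uses our own names and assumes that each clause contains at most one literal per variable, with clauses as frozensets of pairs (v, e) read as “v ≠ e”.

```python
from itertools import combinations
from math import prod


def clause_vars(C):
    """Variables occurring in the clause C."""
    return {v for (v, _) in C}


def is_hitting(F):
    """Every two distinct clauses contain (v, e), (v, f) with e != f."""
    return all(any(v == w and e != f for (v, e) in C for (w, f) in D)
               for C, D in combinations(F, 2))


def hitting_is_unsat(F, dom):
    """For hitting F: unsatisfiable iff the falsified assignments cover all."""
    all_vars = set().union(*(clause_vars(C) for C in F))
    total = prod(len(dom[v]) for v in all_vars)
    # A clause is falsified exactly by the assignments with phi(v) = e for
    # all its literals (v, e); the other variables are free.
    covered = sum(prod(len(dom[v]) for v in all_vars - clause_vars(C))
                  for C in F)
    return covered == total


dom = {"a": {0, 1, 2}, "b": {0, 1}}
F = [frozenset({("a", 0)}), frozenset({("a", 1)}),
     frozenset({("a", 2), ("b", 0)}), frozenset({("a", 2), ("b", 1)})]
print(is_hitting(F), hitting_is_unsat(F, dom))  # True True
```

The counting test is only meaningful when `is_hitting(F)` holds; for non-hitting clause-sets the falsified sets may overlap.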

In [ba we concentrate on unsatisfiable hitting (generalised) clause-sets, and via 
generalised non-Mersenne numbers nM^d(k) we are able to generalise Theorem 
to generalised clause-sets. We believe that in general the minimum variable degree 
of minimally unsatisfiable clause-sets F with sir(F) ≥ 1 for a given deficiency is 
always obtained by unsatisfiable hitting clause-sets (generalising Conjecture 12.3): 


Conjecture 15.5 Let UHIT^d_{δ=k} denote the set of generalised unsatisfiable hitting 
clause-sets of deficiency k ∈ N and with uniform domain-size d ∈ N, and let 
MU^d_{δ=k,sir≥1} be defined in the same way. Then we have for all k, d ∈ N that 


μvd(UHIT^d_{δ=k}) = μvd(MU^d_{δ=k,sir≥1}). 


Furthermore, the “2” in S_2(k) in Lemma is related to the boolean domain, 
and generalising the results of this report in [66] to the non-boolean domain sheds 
light on S_d(k) (the minimal s ∈ N such that d^k divides s!) for arbitrary prime 
numbers d ∈ N, as introduced in [92, Unsolved Problem 49] (while for 
non-prime-numbers d the definition of S_d has to be generalised). See Subsection III.1 in By 
for basic properties of S_d(k). 


15.5 Classification of MU 


As mentioned in the introduction, a major motivation for us is the project of the 
classification of minimally unsatisfiable clause-sets in the deficiency (recall Examples 
B.4, B.3), where the main conjecture is: 


Conjecture 15.6 For every deficiency k ∈ N there are finitely many “patterns” 
which determine the nonsingular elements of MU_{δ=k}, as well as the saturated and 
hitting cases amongst them. Especially for every k the isomorphism types of MU′_{δ=k} 
can be efficiently enumerated (without repetitions), and for any given F ∈ MU′_{δ=k} 
its isomorphism type can be determined in polynomial time. 


Conjecture has been shown for k ≤ 2 (recall Examples B.A, B.3). As we 
discussed in Subsection 1.6.1, the translation e : CLS → HYP has the property 
Fe Musser & e(F) € MNC5,=%-1 for k € N, and so the classification of MUs=% 
can be seen as a subtask of the classification of MNC;5,,=~ for k € No. The possibil- 
ity of a characterisation of MNC5,,—9 was already raised in Al (where concentration 
on the special case of saturated (“strong” there) minimally non-2-colourable hyper- 
graphs was recommended), but is indeed still outstanding, which is understandable, 
given that polytime decision of MU;5_ is easy when compared with polytime deci- 
sion of MNC;,,=0. In the other direction, the consideration of MUs5, =~ C MUs<x, 
(recall Subsection could be a stepping stone (recall MU5,, =1 = UHTTs5=1). 

A major step towards Conjecture should be the classification of unsatis- 
fiable hitting clause-sets in dependency on the deficiency. We remark here that 
unsatisfiable hitting clause-sets do not seem to have a close correspondence in hy- 
pergraph colouring, due to the lack of complementation in hypergraphs. Here the 
main conjecture (which should follow from Conjecture 15.6, once we found the 
precise formulation of “finitely many patterns”) is: 


Conjecture 15.7 For every deficiency k ∈ N there are only finitely many 
isomorphism types of non-singular unsatisfiable hitting clause-sets, or equivalently, the 
number of variables of elements of UHIT′_{δ=k} is bounded. 


For k ≤ 2 finiteness has been established (Examples 8.3), while recently we 
were able to prove it for k = 3 ([67]). Assuming Conjecture 15.7, the question arises 
about the computability of the function which maps k ∈ N to the set of isomorphism 
types. Equivalently one can consider the computability of any function which maps 
k ∈ N to an upper bound on the number of variables of elements of UHIT′_{δ=k}. It is 
conceivable that such functions must grow so quickly that they are not computable; 
we however believe that actually a very small bound holds, and we conjecture the 
following strengthened form of Conjecture 15.7: 


Conjecture 15.8 For every k ∈ N and every F ∈ UHIT′_{δ=k} holds n(F) ≤ 4k − 5. 


This conjecture together with the other conjectures implies computability of 
μnM (using Corollary 5.9): 


Lemma 15.9 Assume that Conjecture holds. 


1. Then the map k ∈ N ↦ μvd(UHIT_{δ=k}) is computable, by enumerating all 
possible clause-sets F with at most 4k − 5 variables, checking whether they are 
in UHIT′_{δ=k}, and if so, including μvd(F) into the maximum-computation. 


2. If also Conjecture 12.4 holds, then the function μnM is also computable. 
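
The enumeration described in Part 1 can be sketched in Python for very small parameters. This is a brute-force illustration under stated simplifications, not the procedure of the report: boolean clauses are frozensets of non-zero integers (+v and −v for the two literals of variable v), the variable bound is passed explicitly as `max_vars` rather than derived from the conjecture, and nonsingularity is not checked, since every clause-set found is in any case an unsatisfiable hitting clause-set of deficiency k. All names are our own.

```python
from itertools import combinations, product


def all_clauses(n):
    """All clauses over the variables 1..n (0 in `signs` means 'absent')."""
    return [frozenset(s * v for v, s in zip(range(1, n + 1), signs) if s != 0)
            for signs in product((0, 1, -1), repeat=n)]


def min_var_degree(F, n):
    """Minimum number of clauses of F in which a variable from 1..n occurs."""
    return min(sum(1 for C in F if v in C or -v in C)
               for v in range(1, n + 1))


def is_unsat_hitting(F, n):
    """Hitting (every two clauses clash) and covering all 2**n assignments.

    For hitting clause-sets the sets of falsifying total assignments are
    pairwise disjoint, so unsatisfiability reduces to the counting test."""
    if any(all(-x not in D for x in C) for C, D in combinations(F, 2)):
        return False
    return sum(2 ** (n - len(C)) for C in F) == 2 ** n


def max_min_var_degree_uhit(k, max_vars):
    """Maximum of the min-var-degree over unsatisfiable hitting clause-sets
    with deficiency k and between 1 and max_vars variables."""
    best = None
    for n in range(1, max_vars + 1):
        for F in combinations(all_clauses(n), n + k):  # c(F) = n + k
            degree = min_var_degree(F, n)
            if degree == 0:  # some variable does not occur, so n(F) < n
                continue
            if is_unsat_hitting(F, n) and (best is None or degree > best):
                best = degree
    return best


print(max_min_var_degree_uhit(1, 2))  # 2
```

For k = 2 the conjectured bound 4k − 5 = 3 keeps the search small, and the result 4 agrees with nM(2); already for k = 3 the bound 7 makes this naive enumeration infeasible.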


Conjecture says additionally, that although μnM is computable, it should be 
“complex”. 